Abstract

In the on-line nearest-neighbour graph (ONG), each point after the first in a sequence
of points in $\mathbb{R}^d$ is joined by an edge to its nearest neighbour amongst those points that
precede it in the sequence. We study the large-sample asymptotic behaviour of the total
power-weighted length of the ONG on uniform random points in $(0,1)^d$. In particular,
for $d = 1$ and weight exponent $\alpha > 1/2$, the limiting distribution of the centred total
weight is characterized by a distributional fixed-point equation. As an ancillary result, we
give exact expressions for the expectation and variance of the standard nearest-neighbour
(directed) graph on uniform random points in the unit interval.
1 Introduction



Spatial graphs, defined on random point sets in Euclidean space, constructed by joining 
nearby points according to some deterministic rule, have been the subject of considerable

recent interest. Examples of such graphs include the geometric graph, the minimal-length 
spanning tree, and the nearest-neighbour graph and its relatives. Many aspects of the 
large-sample asymptotic theory for such graphs, which are locally determined in a certain 
sense, are by now quite well understood. See for example piU IT^ ITU IT^ [T71 051 OB] . 

Many real-world networks have several common features, including spatial structure, 
local construction (nearby points are more likely to be connected), and sequential growth 
(the network evolves over time via the addition of new nodes). In this paper our main 
object of interest is the on-line nearest-neighbour graph, which is one of the simplest 
models of network evolution that captures some of these features. We give a detailed 
description later. Recently, graphs with an 'on-line' structure, i.e. in which vertices are 
added sequentially and connected to existing vertices via some rule, have been the subject 
of considerable study in relation to the modelling of real-world networks. The non-rigorous
literature is extensive (see for example (HI for surveys), but rigorous mathematical 
results are fewer in number, even for simple models, and the existing results concentrate 
on graph-theoretic rather than geometric properties (see e.g. 

The on-line nearest-neighbour graph (or ONG for short) is constructed on n points 
arriving sequentially in $\mathbb{R}^d$ by connecting each point to its nearest neighbour amongst
the preceding points in the sequence. The ONG was apparently introduced in [3] as a 
simple growth model of the world wide web graph (for d = 2). When d = 1, the ONG 
is related to certain fragmentation processes, which are of separate interest in relation 
to, for example, molecular fragmentation (see e.g. [5], and references therein). The ONG
in d = 1 is related to the so-called 'directed linear tree' considered in ^^l- The higher 
dimensional ONG has also been studied. Figure 1 shows a realization of the ONG on
50 simulated random points in the unit interval. Figure 2 below shows realizations of the
planar and three-dimensional ONG, each on 50 simulated uniform random points. 
Figure 1: Realization of the ONG on 50 simulated uniform random points in the unit interval.
The vertical axis gives the order in which the points arrive, and their position is given by the 
horizontal axis. 

We consider the total power-weighted length of the ONG on uniform random points
in $(0,1)^d$, $d \in \mathbb{N}$. We are interested in large-sample asymptotics, as the number of points
tends to infinity. Explicit laws of large numbers for the random ONG in $(0,1)^d$ have been
given previously. In the present paper we give further results on the limiting behaviour in general
dimensions $d$.

Figure 2: Realizations of the ONG on 50 simulated uniform random points in the unit square
(left) and the unit cube (right).

The main part of the present paper is concerned with convergence in distribution
results for the ONG. We give detailed properties of the random ONG on uniform random
points in the unit interval ($d = 1$), and identify the limiting distribution of the centred
total power-weighted length of the graph. When the weight exponent $\alpha$ is greater than
1/2, this distribution is described in terms of a distributional fixed-point equation
reminiscent of those encountered in, for example, the analysis of stochastic 'divide-and-conquer'
or recursive algorithms. Such fixed-point distributional equalities, and the recursive
algorithms from which they arise, have received considerable attention recently; see, for
example, HH iH |12] • 

On the other hand, we believe that for $\alpha \in (0, 1/2]$ the total weight, suitably centred
and scaled, satisfies a central limit theorem (CLT). Penrose ^1] gave such a result for 
$\alpha \in (0, 1/4)$. We believe that it should be possible to derive the CLT for all $\alpha \in (0, 1/2]$
via the divide-and-conquer methods of this paper. The main difficulty is to show that the 
variance of the total weight of the graph scales appropriately in the large sample limit. 
We hope to address this in future work. 

In this paper we also give new explicit results on the expectation and variance of 
the standard one-dimensional nearest-neighbour (directed) graph, in which each point is 
joined by a directed edge to its nearest-neighbour, on uniform random points in the unit 
interval. This is related to our results on the one-dimensional ONG via the theory of 
Dirichlet spacings, which we make use of in our analysis. 



2 Definitions and main results 

Let $\mathcal{X}$ be a finite sequence of points in $\mathbb{R}^d$, and let $\|\cdot\|$ be the Euclidean norm. For $d \in \mathbb{N}$,
let

$$v_d := \pi^{d/2}\,[\Gamma(1 + (d/2))]^{-1}, \tag{1}$$

the volume of the unit d-ball (see e.g. equation (6.50) of [HI). 

Define $w$ to be a weight function on edges, assigning weight $w(x, y)$ to the edge between
$x \in \mathbb{R}^d$ and $y \in \mathbb{R}^d$, such that $w : \mathbb{R}^d \times \mathbb{R}^d \to [0, \infty)$. A case of particular interest is
when the weight is taken to be power-weighted Euclidean distance. In this case, for some
$\alpha > 0$, we have the weight function

$$w_\alpha(x, y) := \|x - y\|^\alpha, \tag{2}$$

for $x, y \in \mathbb{R}^d$.
2.1 The on-line nearest-neighbour graph

We now give a formal definition of the on-line nearest-neighbour graph (ONG). Let $d \in \mathbb{N}$.
Suppose $x_1, x_2, \ldots$ are points in $(0,1)^d$, arriving sequentially; the ONG on vertex set
$\{x_1, \ldots, x_n\}$ is formed by connecting each point $x_i$, $i = 2, 3, \ldots, n$ to its nearest neighbour
(in the Euclidean sense) amongst the preceding points in the sequence (i.e. $x_1, \ldots, x_{i-1}$),
using the lexicographic ordering on $\mathbb{R}^d$ to break any ties. We call the resulting tree the
ONG on $(x_1, x_2, \ldots, x_n)$.

From now on we take the sequence of points to be random. Let $U_1, U_2, \ldots$ be a
sequence of independent uniform random vectors on $(0,1)^d$. Then for $n \in \mathbb{N}$ take $\mathcal{U}_n =
(U_1, U_2, \ldots, U_n)$, the binomial point process consisting of $n$ independent uniform random
vectors on $(0,1)^d$. Denote the ONG constructed on $\mathcal{U}_n$ by $\mathrm{ONG}(\mathcal{U}_n)$. We restrict our
analysis to the case of uniformly distributed points. Note that, with probability one, $\mathcal{U}_n$
has distinct inter-point distances so that the ONG on $\mathcal{U}_n$ is almost surely unique.
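The construction just described can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the paper: the function names (`sq_dist`, `ong_edges`, `ong_total_weight`, `uniform_points`) are hypothetical, the search is a naive quadratic scan, and the tie-break (prefer the lexicographically smaller predecessor) is one assumed reading of the lexicographic rule, which is irrelevant almost surely for uniform random points.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def ong_edges(points):
    """Edges (i, j) of the ONG: point i joins its nearest predecessor j < i."""
    edges = []
    for i in range(1, len(points)):
        # Nearest earlier point; exact ties go to the lexicographically
        # smaller predecessor (an assumed reading of the tie-break rule).
        j = min(range(i), key=lambda k: (sq_dist(points[i], points[k]), points[k]))
        edges.append((i, j))
    return edges

def ong_total_weight(points, alpha):
    """Sum of ||x_i - x_j||^alpha over the ONG edges."""
    return sum(sq_dist(points[i], points[j]) ** (alpha / 2.0)
               for i, j in ong_edges(points))

def uniform_points(n, d, seed=0):
    """n independent (pseudo-random, seeded) uniform points in (0,1)^d."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(d)) for _ in range(n)]
```

For instance, `ong_total_weight(uniform_points(50, 2), 1.0)` gives the total Euclidean length of one simulated planar ONG on 50 points; the graph is a tree, so it always has one edge fewer than it has points.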

The ONG is of interest as a natural growth model for random spatial graphs; in 
particular it has been used (with $d = 2$) in the context of the world wide web graph (see
ini). In Jl], stabilization techniques were used to prove that the total length (suitably 
centred and scaled) of the ONG on uniform random points in $(0,1)^d$ for $d > 4$ converges
in distribution to a normal random variable. It is suspected that a CLT also holds for
$d = 2, 3, 4$. On the other hand, when $d = 1$, the limit is not normal, as demonstrated by
Theorem 2.2 (ii) below.

For $d \in \mathbb{N}$ and $\alpha > 0$, let $\mathcal{O}^{d,\alpha}(\mathcal{U}_n)$ denote the total weight, with weight function $w_\alpha$ as
given by (2), of $\mathrm{ONG}(\mathcal{U}_n)$. Our results for the ONG in general dimensions are as follows,
and constitute a distributional convergence result for $\alpha > d$, and asymptotic behaviour of
the mean for $\alpha = d$. For the sake of completeness, we include the law of large numbers
for a < d from |^ as part (i) of the theorem below. 

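Theorem 2.1 Suppose $d \in \mathbb{N}$. We have the following:

(i) Suppose $0 < \alpha < d$. Then, as $n \to \infty$,

$$n^{(\alpha-d)/d}\, \mathcal{O}^{d,\alpha}(\mathcal{U}_n) \longrightarrow \frac{d}{d-\alpha}\, v_d^{-\alpha/d}\, \Gamma(1 + (\alpha/d)). \tag{3}$$

(ii) Suppose $\alpha > d$. Then, as $n \to \infty$,

$$\mathcal{O}^{d,\alpha}(\mathcal{U}_n) \longrightarrow W(d,\alpha), \tag{4}$$

where the convergence is in $L^p$ ($p \in \mathbb{N}$) and almost sure, and $W(d,\alpha)$ is a nonnegative
random variable with $E[(W(d,\alpha))^k] < \infty$ for $k \in \mathbb{N}$.

(iii) Suppose $\alpha = d$. Then, as $n \to \infty$,

$$E[\mathcal{O}^{d,d}(\mathcal{U}_n)] = v_d^{-1} \log n + o(\log n). \tag{5}$$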

In particular, (5) implies that $E[\mathcal{O}^{1,1}(\mathcal{U}_n)] \sim (1/2) \log n$, a result given more precisely in
Proposition 2.1 below. We prove Theorem 2.1 (ii) and (iii) in Section 3.

Now we consider the particular case of the ONG in $d = 1$, where $\mathcal{U}_n$ is now a sequence
of independent uniform random points in the unit interval (0,1). Let $\gamma$ denote Euler's
constant, so that $\gamma \approx 0.57721566$ and

$$\left(\sum_{i=1}^{k} \frac{1}{i}\right) - \log k = \gamma + O(k^{-1}). \tag{6}$$

The following result gives the expectation of the total weight of $\mathrm{ONG}(\mathcal{U}_n)$.
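Proposition 2.1 As $n \to \infty$, we have

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n)] = \frac{\Gamma(1+\alpha)}{1-\alpha}\, 2^{-\alpha} n^{1-\alpha} + \frac{2}{\alpha(1+\alpha)} - \frac{2^{1-\alpha}}{\alpha(1-\alpha^2)} + O(n^{-\alpha}) \qquad (0 < \alpha < 1);$$

$$E[\mathcal{O}^{1,1}(\mathcal{U}_n)] = \frac{1}{2}\log n + \frac{\gamma}{2} - \frac{1}{4} + o(1);$$

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n)] = \frac{2}{\alpha(\alpha+1)}\left(1 + \frac{2^{-\alpha}}{\alpha-1}\right) + O(n^{1-\alpha}) \qquad (\alpha > 1).$$

Proof. The proposition follows from Proposition 4.2 with Lemma 4.2. □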

In Theorem 2.2 below, we present our main convergence in distribution results for the
total weight of the ONG (centred, in some cases) in $d = 1$. The limiting distributions are
of different types depending on the value of $\alpha$ in the weight function (2). In this paper, we
restrict attention to $\alpha > 1/2$, and we define these limiting distributions in Theorem 2.2
in terms of distributional fixed-point equations (sometimes called recursive distributional
equations, see [2]). These fixed-point equations are of the form

$$X \stackrel{d}{=} \sum_{r=1}^{k} A_r X^{\{r\}} + B, \tag{7}$$

where $k \in \mathbb{N}$, $X^{\{r\}}$, $r = 1, \ldots, k$, are independent copies of the random variable $X$,
and $(A_1, \ldots, A_k, B)$ is a random vector, independent of $(X^{\{1\}}, \ldots, X^{\{k\}})$, satisfying the
conditions

$$E \sum_{r=1}^{k} |A_r|^2 < 1, \qquad E[B] = 0, \qquad E[B^2] < \infty. \tag{8}$$

Theorem 3 of Rösler j^l] (proved by the contraction mapping theorem; see also |11[ V2'2\ )
says that if (8) holds, there is a unique square-integrable distribution with mean zero
satisfying the fixed-point equation (7), and this will guarantee uniqueness of solutions to
all the distributional fixed-point equalities considered in the sequel.

We now define the distributions that will appear as limits in Theorem 2.2 in terms
of (unique) solutions to fixed-point equations. In each case, $U$ denotes a uniform random
variable on (0, 1), independent of the other random variables on the right hand side of the
distributional equality. The fixed-point equations (9)-(12) are all of the form of (7), and
hence define unique solutions.

We define $J_1$ by the distributional fixed-point equation

$$J_1 \stackrel{d}{=} \min\{U, 1-U\} + U J_1^{\{1\}} + (1-U) J_1^{\{2\}} + \frac{U}{2}\log U + \frac{1-U}{2}\log(1-U). \tag{9}$$

We shall see later (Proposition ^3)) that $E[J_1] = 0$. For $\alpha > 1/2$, $\alpha \neq 1$, define $J_\alpha$ by

$$J_\alpha \stackrel{d}{=} U^\alpha J_\alpha^{\{1\}} + (1-U)^\alpha J_\alpha^{\{2\}} + \min\{U^\alpha, (1-U)^\alpha\} + \frac{2^{-\alpha}}{\alpha-1}\left(U^\alpha + (1-U)^\alpha - 1\right). \tag{10}$$

Define the random variable $H_1$ by

$$H_1 \stackrel{d}{=} U J_1 + (1-U) H_1 + \frac{U}{2} + \frac{U}{2}\log U + \frac{1-U}{2}\log(1-U), \tag{11}$$
where $J_1$ has the distribution given by (9), and is independent of the $H_1$ on the right. We
shall see later (Theorem 4.1) that $E[H_1] = 0$. We give the first three moments of $J_1$ and
$H_1$ in Table 2 later in this paper. For $\alpha > 1/2$, $\alpha \neq 1$, define $H_\alpha$ by

$$H_\alpha \stackrel{d}{=} U^\alpha J_\alpha + (1-U)^\alpha H_\alpha + U^\alpha\left(1 + \frac{2^{-\alpha}}{\alpha-1}\right) + \left((1-U)^\alpha - 1\right)\left(\frac{1}{\alpha} + \frac{2^{-\alpha}}{\alpha(\alpha-1)}\right), \tag{12}$$

where $J_\alpha$ has the distribution given by (10) and is independent of the $H_\alpha$ on the right.
We shall see later that, for $\alpha > 1$, the $J_\alpha$ and $H_\alpha$ defined in (10) and (12) arise as centred
versions of the random variables $\tilde{J}_\alpha$ and $\tilde{H}_\alpha$, respectively, satisfying the slightly simpler
fixed-point equations (13) and (14) below, so that $E[J_\alpha] = E[H_\alpha] = 0$; see Proposition
4.5. For $\alpha > 1$, we have

$$\tilde{J}_\alpha \stackrel{d}{=} U^\alpha \tilde{J}_\alpha^{\{1\}} + (1-U)^\alpha \tilde{J}_\alpha^{\{2\}} + \min\{U^\alpha, (1-U)^\alpha\}. \tag{13}$$

Also for $\alpha > 1$, we have

$$\tilde{H}_\alpha \stackrel{d}{=} U^\alpha + U^\alpha \tilde{J}_\alpha + (1-U)^\alpha \tilde{H}_\alpha, \tag{14}$$

where $\tilde{J}_\alpha$ has distribution given by (13) and is independent of the $\tilde{H}_\alpha$ on the right. The
expectations of $\tilde{J}_\alpha$ and $\tilde{H}_\alpha$ are given in Proposition 4.5. Note that the uniqueness of the
$J_\alpha$ and $H_\alpha$ implies the uniqueness of $\tilde{J}_\alpha$ and $\tilde{H}_\alpha$ also.

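Theorem 2.2 gives our main results for the $\mathrm{ONG}(\mathcal{U}_n)$ in one dimension. Theorem
2.2 will follow as a corollary to Theorem 4.1, which we present later. Let $\tilde{\mathcal{O}}^{d,\alpha}(\mathcal{U}_n) :=
\mathcal{O}^{d,\alpha}(\mathcal{U}_n) - E[\mathcal{O}^{d,\alpha}(\mathcal{U}_n)]$ be the centred total weight of the ONG on $\mathcal{U}_n$. For ease of
notation, we define the following random variables. As before, $U$ is uniform on (0, 1) and
independent of the other variables on the right. For $1/2 < \alpha < 1$, let

$$G_\alpha \stackrel{d}{=} U^\alpha H_\alpha^{\{1\}} + (1-U)^\alpha H_\alpha^{\{2\}} + \left(U^\alpha + (1-U)^\alpha - \frac{2}{1+\alpha}\right)\left(\frac{1}{\alpha} - \frac{2^{-\alpha}}{\alpha(1-\alpha)}\right), \tag{15}$$

where $H_\alpha^{\{1\}}, H_\alpha^{\{2\}}$ are independent with distribution given by (12). Also let

$$G_1 \stackrel{d}{=} U H_1^{\{1\}} + (1-U) H_1^{\{2\}} + \frac{U}{2}\log U + \frac{1-U}{2}\log(1-U) + \frac{1}{4}, \tag{16}$$

where $H_1^{\{1\}}, H_1^{\{2\}}$ are independent with distribution given by (11). Now we state our
convergence in distribution results. We prove Theorem 2.2 in Section 4.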
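As a sanity check on the $\alpha = 1$ fixed-point equations, one can take variances on both sides, using $E[U^2] = E[(1-U)^2] = 1/3$ and the fact that the additive terms have mean zero. The sketch below does this numerically. It assumes the additive terms are $\min\{U,1-U\} + \frac{1}{2}\ell(U)$ for $J_1$, $\frac{U}{2} + \frac{1}{2}\ell(U)$ for $H_1$ and $\frac{1}{2}\ell(U) + \frac{1}{4}$ for $G_1$, with $\ell(u) = u\log u + (1-u)\log(1-u)$, which is our reading of (9), (11) and (16). The helper names and the midpoint quadrature are illustrative choices, not part of the paper.

```python
import math

def integrate(f, a, b, m=20000):
    """Composite midpoint rule on [a, b] with m cells (never evaluates endpoints)."""
    h = (b - a) / m
    return h * sum(f(a + (k + 0.5) * h) for k in range(m))

def ell(u):
    """u log u + (1 - u) log(1 - u), the logarithmic part of the additive terms."""
    return u * math.log(u) + (1 - u) * math.log(1 - u)

def b_j(u):
    return min(u, 1 - u) + 0.5 * ell(u)   # additive term assumed for J_1

def b_h(u):
    return 0.5 * u + 0.5 * ell(u)         # additive term assumed for H_1

def b_g(u):
    return 0.5 * ell(u) + 0.25            # additive term assumed for G_1

def second_moment(b):
    """E[b(U)^2] for U uniform on (0, 1)."""
    return integrate(lambda u: b(u) ** 2, 0.0, 1.0)

# Var J = (2/3) Var J + E[B_J^2]
var_j = 3.0 * second_moment(b_j)
# Var H = (1/3) Var J + (1/3) Var H + E[B_H^2]
var_h = 1.5 * (var_j / 3.0 + second_moment(b_h))
# Var G = (2/3) Var H + E[B_G^2]
var_g = (2.0 / 3.0) * var_h + second_moment(b_g)
```

Under these assumptions the three values agree, to quadrature accuracy, with the closed forms $((1+\log 2)/4) - \pi^2/24$, $((3+\log 2)/8) - \pi^2/24$ and $((19 + 4\log 2)/48) - \pi^2/24$ quoted in this paper for the variances of $J_1$, $H_1$ and $G_1$.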

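Theorem 2.2 (i) For $1/2 < \alpha < 1$, we have that, as $n \to \infty$,

$$\tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n) \stackrel{d}{\longrightarrow} G_\alpha, \tag{17}$$

where $G_\alpha$ has distribution given by (15), and $E[G_\alpha] = 0$.

(ii) For $\alpha = 1$, we have that, as $n \to \infty$,

$$\mathcal{O}^{1,1}(\mathcal{U}_n) - \frac{1}{2}(\gamma + \log n) + \frac{1}{4} \stackrel{d}{\longrightarrow} G_1, \tag{18}$$

where $G_1$ has distribution given by (16). Also, $E[G_1] = 0$, $\mathrm{Var}[G_1] = (19 + 4\log 2 -
2\pi^2)/48 \approx 0.042362$, and $E[G_1^3] \approx 0.00444287$.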



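(iii) For $\alpha > 1$, the distribution of the limit $W(1,\alpha)$ of (4) is given by

$$W(1,\alpha) \stackrel{d}{=} U^\alpha \tilde{H}_\alpha^{\{1\}} + (1-U)^\alpha \tilde{H}_\alpha^{\{2\}},$$

where $\tilde{H}_\alpha^{\{1\}}, \tilde{H}_\alpha^{\{2\}}$ are independent with distribution given by (14).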




Remarks, (a) In Theorem 3.6 of ^3], a CLT for 0'^'"'{Un) is obtained for the case 
< a < (i/4. In the context of Theorem 12.11 the result of ^1] imphes that, provided 
< a < d/4, as n ^ cx), n("/'^)~(^/^)(!)'''"(Z^„) is asymptoticahy normal. In [Hj, it is 
remarked that it should be possible to extend the result to the case d/4 < a < d/2 and 
perhaps a = d/2 also. We hope to address this in future work; in particular, the case 
d = 1 should be amenable to solution via the divide-and-conquer approach of this paper. 

(b) A closely related 'directed' version of the one-dimensional ONG is the 'directed
linear tree' (DLT), in which each point is joined to its nearest neighbour
to the left amongst those points preceding it in the sequence, if such points exist. In [15],
results for the DLT with $\alpha \ge 1$ analogous to parts (ii) and (iii) of Theorem 2.2 were
given. Following the methods of the present paper, one can obtain results for the DLT
with $1/2 < \alpha < 1$ analogous to part (i) of Theorem 2.2.

(c) Of interest is the limit behaviour of $\mathcal{O}^{d,d}(\mathcal{U}_n)$ (i.e. when $\alpha = d$). When $d = 1$,
we have that $\mathcal{O}^{1,1}(\mathcal{U}_n) - E[\mathcal{O}^{1,1}(\mathcal{U}_n)]$ converges in distribution to a non-normal limiting
random variable (see Theorem 2.2 (ii)). It would be interesting to determine whether
$\mathcal{O}^{d,d}(\mathcal{U}_n) - E[\mathcal{O}^{d,d}(\mathcal{U}_n)]$ converges in distribution to a nondegenerate random variable for
general $d = 2, 3, 4, \ldots$, and whether or not this distribution is normal.

(d) With some more detailed calculations (given in ^5)), one can replace the error
term $o(\log n)$ in (5) by $O(1)$ (see the remark in Section 3).

(e) Figure 3 is a plot of the estimated probability density function of $G_1$ given by (16).
This was obtained by performing 10^ repeated simulations of the ONG on a sequence 
of $10^3$ uniform (simulated) random points on (0,1). For each simulation, the expected
value of $\mathcal{O}^{1,1}(\mathcal{U}_{10^3})$ was subtracted from the total length of the simulated ONG to give
an approximate realization of the distributional limit. The density function was then 
estimated from the sample of 10^ realizations. The simulated sample from which the 
density estimate was taken had sample mean ~ 3 x 10~^ and sample variance ~ 0.0425, 
which are reasonably close to the expectation and variance of Gi. 

2.2 The nearest-neighbour (directed) graph 

Our next result gives exact expressions for the expectation and variance of the total weight
of the nearest-neighbour (directed) graph on $n$ independent uniform random points
in the unit interval. The nearest-neighbour (directed) graph on a point set $\mathcal{X}$ places a
directed edge from each vertex to its nearest neighbour (in the Euclidean sense).

Let $\mathcal{L}_1^{1,\alpha}(\mathcal{X})$ denote the total weight, with weight function $w_\alpha$ given by (2), of the
nearest-neighbour (directed) graph on vertex set $\mathcal{X} \subset (0,1)$. We use this notation to be
consistent with [25], which presents explicit laws of large numbers for nearest-neighbour
graphs including this one. Let $\mathcal{U}_n$ denote the binomial point process consisting of $n$
independent uniform random points in the unit interval. In this section we give explicit
results for the expectation and variance of $\mathcal{L}_1^{1,\alpha}(\mathcal{U}_n)$.

Figure 3: Estimated probability density function for $G_1$.

Let ${}_2F_1(\cdot, \cdot; \cdot; \cdot)$ denote the Gauss hypergeometric function (see e.g. Chapter 15 of [1])
defined for $|z| < 1$ and $c \neq 0, -1, -2, \ldots$ by

$${}_2F_1(a, b; c; z) := \sum_{i=0}^{\infty} \frac{(a)_i (b)_i}{(c)_i\, i!}\, z^i, \tag{19}$$

where $(a)_i$ is Pochhammer's symbol $(a)_i := \Gamma(a + i)/\Gamma(a)$. For $n \in \{2, 3, \ldots\}$, $\alpha > 0$, set

$$J_{n,\alpha} := 6^{-\alpha-1}\, \frac{\Gamma(n+1)\Gamma(2+2\alpha)}{(1+\alpha)\Gamma(n+1+2\alpha)}\; {}_2F_1(-\alpha, 1+\alpha; 2+\alpha; 1/3). \tag{20}$$

Also, for $\alpha > 0$, set

$$j_\alpha := 8 \lim_{n\to\infty} \left(n^{2\alpha} J_{n,\alpha}\right) = 8 \cdot 6^{-\alpha-1}\, \frac{\Gamma(2+2\alpha)}{1+\alpha}\; {}_2F_1(-\alpha, 1+\alpha; 2+\alpha; 1/3). \tag{21}$$

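Theorem 2.3 Suppose $\alpha > 0$. For $n \in \{2, 3, 4, \ldots\}$ we have

$$E[\mathcal{L}_1^{1,\alpha}(\mathcal{U}_n)] = \left((n-2)2^{-\alpha} + 2\right) \frac{\Gamma(n+1)\Gamma(\alpha+1)}{\Gamma(n+\alpha+1)} \sim 2^{-\alpha}\Gamma(\alpha+1)\, n^{1-\alpha}, \tag{22}$$

as $n \to \infty$. Also, for $n \in \{4, 5, 6, \ldots\}$,

$$\begin{aligned}
\mathrm{Var}[\mathcal{L}_1^{1,\alpha}(\mathcal{U}_n)] = {} & \frac{\Gamma(n+1)}{\Gamma(n+2\alpha+1)} \Big[ \Gamma(2\alpha+1)\left(2 - 2\cdot 3^{-2\alpha} + 4^{-\alpha} n + 2\cdot 3^{-1-2\alpha} n\right) \\
& + \Gamma(\alpha+1)^2 \left(4 + 12\cdot 4^{-\alpha} - 12\cdot 2^{-\alpha} + 2^{2-\alpha} n - 7\cdot 4^{-\alpha} n + 4^{-\alpha} n^2\right) \Big] \\
& - \left(E[\mathcal{L}_1^{1,\alpha}(\mathcal{U}_n)]\right)^2 + 8(n-3) J_{n,\alpha},
\end{aligned} \tag{23}$$

where $E[\mathcal{L}_1^{1,\alpha}(\mathcal{U}_n)]$ is given by (22) and $J_{n,\alpha}$ is given by (20). Further, for $\alpha > 0$,

$$n^{2\alpha-1}\, \mathrm{Var}[\mathcal{L}_1^{1,\alpha}(\mathcal{U}_n)] \longrightarrow \left(4^{-\alpha} + 2\cdot 3^{-1-2\alpha}\right)\Gamma(2\alpha+1) - 4^{-\alpha}(3+\alpha^2)\Gamma(\alpha+1)^2 + j_\alpha, \tag{24}$$

as $n \to \infty$, where $j_\alpha$ is given by (21).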
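The formulas of Theorem 2.3 are easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, the hypergeometric function is summed directly from its power series at $z = 1/3$, and the constant term $2 - 2\cdot 3^{-2\alpha}$ in the first bracket of (23) is our reading of a partly illegible formula, chosen because it reproduces the closed-form variances quoted for $\alpha = 1$ and $\alpha = 2$.

```python
import math

def hyp2f1(a, b, c, z, tol=1e-16, max_terms=500):
    """Gauss hypergeometric series for |z| < 1, summed term by term."""
    term, total = 1.0, 1.0
    for i in range(max_terms):
        term *= (a + i) * (b + i) / ((c + i) * (i + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def gamma_ratio(n, s):
    """Gamma(n + 1) / Gamma(n + 1 + s), via lgamma to avoid overflow."""
    return math.exp(math.lgamma(n + 1) - math.lgamma(n + 1 + s))

def J(n, alpha):
    """J_{n,alpha} as in (20)."""
    return (6.0 ** (-alpha - 1) * gamma_ratio(n, 2 * alpha)
            * math.gamma(2 + 2 * alpha) / (1 + alpha)
            * hyp2f1(-alpha, 1 + alpha, 2 + alpha, 1.0 / 3.0))

def j_limit(alpha):
    """j_alpha as in (21)."""
    return (8 * 6.0 ** (-alpha - 1) * math.gamma(2 + 2 * alpha) / (1 + alpha)
            * hyp2f1(-alpha, 1 + alpha, 2 + alpha, 1.0 / 3.0))

def nng_mean(n, alpha):
    """Expected total weight of the directed nearest-neighbour graph, (22)."""
    return ((n - 2) * 2.0 ** (-alpha) + 2) * gamma_ratio(n, alpha) * math.gamma(alpha + 1)

def nng_var(n, alpha):
    """Variance of the total weight for n >= 4, following (23) as read above."""
    g1, g2 = math.gamma(alpha + 1), math.gamma(2 * alpha + 1)
    bracket = (g2 * (2 - 2 * 3.0 ** (-2 * alpha) + 4.0 ** (-alpha) * n
                     + 2 * 3.0 ** (-1 - 2 * alpha) * n)
               + g1 ** 2 * (4 + 12 * 4.0 ** (-alpha) - 12 * 2.0 ** (-alpha)
                            + 2.0 ** (2 - alpha) * n - 7 * 4.0 ** (-alpha) * n
                            + 4.0 ** (-alpha) * n * n))
    return gamma_ratio(n, 2 * alpha) * bracket - nng_mean(n, alpha) ** 2 + 8 * (n - 3) * J(n, alpha)

def V(alpha):
    """Limit of n^(2 alpha - 1) Var, as in (24)."""
    return ((4.0 ** (-alpha) + 2 * 3.0 ** (-1 - 2 * alpha)) * math.gamma(2 * alpha + 1)
            - 4.0 ** (-alpha) * (3 + alpha ** 2) * math.gamma(alpha + 1) ** 2
            + j_limit(alpha))
```

With these definitions, `V(1.0)`, `V(2.0)`, `V(3.0)` and `V(4.0)` reproduce the rational limiting constants $1/6$, $85/108$, $149/18$ and $135793/972$, and `V(0.5)` is approximately 0.094148.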
Using (23) with (20), one obtains, for instance,

$$\mathrm{Var}[\mathcal{L}_1^{1,1}(\mathcal{U}_n)] = \frac{2n^2 + 17n + 12}{12(n+1)^2(n+2)} = \frac{1}{6}\, n^{-1} + O(n^{-2}),$$

and

$$\mathrm{Var}[\mathcal{L}_1^{1,2}(\mathcal{U}_n)] = \frac{85n^3 + 3645n^2 + 7154n - 456}{108(n+1)^2(n+2)^2(n+3)(n+4)} = \frac{85}{108}\, n^{-3} + O(n^{-4}).$$

Also, the limiting constants $j_\alpha$ can be evaluated explicitly, so that one can obtain values
for $V_\alpha := \lim_{n\to\infty} \left(n^{2\alpha-1}\, \mathrm{Var}[\mathcal{L}_1^{1,\alpha}(\mathcal{U}_n)]\right)$. Table 1 below gives some values of $V_\alpha$. We
prove Theorem 2.3 in Section 5. One can obtain analogous explicit results in the case of
$\mathcal{L}_1^{1,\alpha}(\mathcal{P}_n)$, where $\mathcal{P}_n$ is a homogeneous Poisson point process of intensity $n$ on (0, 1): see
[24], where a "Poissonized" version of (24) is given.

    alpha      1/2           1      2         3         4
    V_alpha    ≈ 0.094148    1/6    85/108    149/18    135793/972

Table 1: Some values of $V_\alpha$.

The remainder of the present paper is organized as follows. Our results on the ONG in
general dimensions (Theorem 2.1 (ii) and (iii)) are proved in Section 3. The main body of
this paper, Section 4, is devoted to the ONG in one dimension and the proof of Theorem
2.2. In Section 5 we prove Theorem 2.3. Finally, in the Appendix, we give the proofs of
some technical lemmas which would otherwise interrupt the flow of the paper.

3 Proof of Theorem 2.1 (ii) and (iii)

Suppose $d \in \mathbb{N}$. For $i \in \mathbb{N}$, let $Z_i(d) := \mathcal{O}^{d,1}(\mathcal{U}_i) - \mathcal{O}^{d,1}(\mathcal{U}_{i-1})$, setting $\mathcal{O}^{d,1}(\mathcal{U}_0) := 0$. That
is, $Z_i(d)$ is the gain in length of the ONG on a sequence of independent uniform random
points in $(0,1)^d$ on the addition of the $i$th point. Let $d_1(x; \mathcal{X})$ denote the (Euclidean)
distance between $x \in \mathbb{R}^d$ and its nearest neighbour in the point set $\mathcal{X} \subset \mathbb{R}^d$.

Lemma 3.1 For $\alpha > 0$ and $d \in \mathbb{N}$, as $n \to \infty$,

$$E[(Z_n(d))^\alpha] = O(n^{-\alpha/d}). \tag{25}$$

Proof. We have

$$E[(Z_n(d))^\alpha] = E[(d_1(U_1; \mathcal{U}_n))^\alpha] = n^{-\alpha/d}\, E[(d_1(n^{1/d}U_1; n^{1/d}\mathcal{U}_n))^\alpha],$$

which is $O(n^{-\alpha/d})$ (see the proof of Lemma 3.3 in [ISl)). □

Remark. We can obtain, by some more detailed analysis (see [24]),

$$E[(Z_n(d))^\alpha] = (n v_d)^{-\alpha/d}\, \Gamma(1 + (\alpha/d)) + o(n^{-\alpha/d}).$$

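Proof of Theorem 2.1 (ii) and (iii). With the definition of $Z_i(d)$ in this section, let

$$W(d,\alpha) := \sum_{i=1}^{\infty} (Z_i(d))^\alpha.$$

The sum converges almost surely since it has non-negative terms and, by (25), has finite
expectation for $\alpha > d$. Let $k \in \mathbb{N}$. By (25) and Hölder's inequality, there exists a constant
$C \in (0, \infty)$ such that

$$E[(W(d,\alpha))^k] = \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{\infty} \cdots \sum_{i_k=1}^{\infty} E\left[(Z_{i_1}(d))^\alpha (Z_{i_2}(d))^\alpha \cdots (Z_{i_k}(d))^\alpha\right]
\le C \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{\infty} \cdots \sum_{i_k=1}^{\infty} i_1^{-\alpha/d}\, i_2^{-\alpha/d} \cdots i_k^{-\alpha/d} < \infty,$$

since $\alpha/d > 1$. The $L^k$ convergence then follows from the dominated convergence theorem,
and we have part (ii) of Theorem 2.1.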

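Finally, for (iii) of Theorem 2.1 we have, when $\alpha = d$,

$$d_1(n^{1/d}U_1; n^{1/d}\mathcal{U}_n) \stackrel{d}{\longrightarrow} d_1(0; \mathcal{H}_1),$$

where $\mathcal{H}_1$ denotes a homogeneous Poisson point process of unit intensity on $\mathbb{R}^d$,
by the proof of Lemma 3.2 of ^7]. Since the sequence $(d_1(n^{1/d}U_1; n^{1/d}\mathcal{U}_n))^d$ is uniformly
integrable (see the proof of Theorem 2.4 of ^7j), we have

$$E[n (Z_n(d))^d] \to E[(d_1(0; \mathcal{H}_1))^d] = v_d^{-1},$$

where the last equality follows by a simple computation, or by equation (2.7) of [25].

So $E[(Z_n(d))^d] = n^{-1}(v_d^{-1} + h(n))$ where $h(n) \to 0$ as $n \to \infty$. Thus

$$E \sum_{i=1}^{n} (Z_i(d))^d = \sum_{i=1}^{n} i^{-1}\left(v_d^{-1} + h(i)\right) = v_d^{-1} \log n + o(\log n),$$

and so we have (5), completing the proof of Theorem 2.1. □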



4 The ONG in d = 1
4.1 Notation and results

In this section we analyse the ONG in the interval (0, 1). Theorem 2.2 will follow from
the main result of this section, Theorem 4.1 below. We introduce our notation.

For any finite sequence of points $\mathcal{T}_n = (x_1, x_2, \ldots, x_n) \in [0,1]^n$ with distinct
inter-point distances, we construct the ONG as follows. Insert the points $x_1, x_2, \ldots$ into [0, 1]
in order, one at a time. We join a new point by an edge to its nearest neighbour among
those already present, provided that such a point exists. In other words, for each point
$x_i$, $i \ge 2$, we join $x_i$ by an edge to the point of $\{x_j : 1 \le j < i\}$ that minimizes $|x_i - x_j|$.
In this way we construct a tree rooted at $x_1$, which we denote by $\mathrm{ONG}(\mathcal{T}_n)$. Denote the
total weight (under weight function $w_\alpha$ given by (2), $\alpha > 0$) of $\mathrm{ONG}(\mathcal{T}_n)$ by $\mathcal{O}^{1,\alpha}(\mathcal{T}_n)$, to
be consistent with our previous notation.

For what follows, our main interest is the case in which $\mathcal{T}_n$ is a random vector in
$[0,1]^n$. In this case, set $\tilde{\mathcal{O}}^{1,\alpha}(\mathcal{T}_n) := \mathcal{O}^{1,\alpha}(\mathcal{T}_n) - E[\mathcal{O}^{1,\alpha}(\mathcal{T}_n)]$, the centred total weight of
the ONG on $\mathcal{T}_n$. Let $(U_1, U_2, U_3, \ldots)$ be a sequence of independent uniformly distributed
random variables in (0, 1), and for $n \in \mathbb{N}$ set $\mathcal{U}_n := (U_1, U_2, \ldots, U_n)$. Given $\mathcal{U}_n$, we define
the augmented sequences $\mathcal{U}_n^0 := (0, U_1, \ldots, U_n)$ and $\mathcal{U}_n^{0,1} := (0, 1, U_1, \ldots, U_n)$. Notice that
$\mathrm{ONG}(\mathcal{U}_n^{0,1})$ and $\mathrm{ONG}(\mathcal{U}_n^0)$ both give a tree rooted at 0, and that in $\mathrm{ONG}(\mathcal{U}_n^{0,1})$ the first
edge is from 1 to 0.

We now state the main result of this section, from which Theorem 2.2 will follow.
The joint convergence in distribution results in (26) and (27) are given in more detail,
complete with joint distribution fixed-point representation, in Propositions 4.3 and 4.4.
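Theorem 4.1 (i) For $1/2 < \alpha < 1$, we have that, as $n \to \infty$,

$$\left(\tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0,1}),\ \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0}),\ \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n)\right) \stackrel{d}{\longrightarrow} (J_\alpha, H_\alpha, G_\alpha), \tag{26}$$

where $J_\alpha, H_\alpha, G_\alpha$ are jointly distributed random variables with marginal distributions
given by (10), (12), (15) respectively.

(ii) For $\alpha = 1$, we have that, as $n \to \infty$,

$$\left(\tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n^{0,1}),\ \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n^{0}),\ \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n)\right) \stackrel{d}{\longrightarrow} (J_1, H_1, G_1), \tag{27}$$

where $J_1, H_1, G_1$ are jointly distributed random variables with marginal distributions
given by (9), (11), (16) respectively. The first three moments of $J_1$, $H_1$ and $G_1$
are given in Table 2. Further, the variables on the right hand side of (27) satisfy
$\mathrm{Cov}(J_1, H_1) = ((9 + 6\log 2)/32) - (\pi^2/24) \approx -1.84204 \times 10^{-5}$,
$\mathrm{Cov}(G_1, H_1) = ((35 + 10\log 2)/96) - (\pi^2/24) \approx 0.0255526$, and
$\mathrm{Cov}(G_1, J_1) = ((7 + 4\log 2)/24) - (\pi^2/24) \approx -4.04232 \times 10^{-3}$.

(iii) For $\alpha > 1$, we have that, as $n \to \infty$,

$$\mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1}) \longrightarrow 1 + \tilde{J}_\alpha; \qquad \mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0}) \longrightarrow \tilde{H}_\alpha,$$

where the convergence is almost sure and in $L^p$, $p \in \mathbb{N}$, and the distributions of $\tilde{J}_\alpha$
and $\tilde{H}_\alpha$ are given by (13) and (14) respectively.

           E[.]    Var[.]                                             E[(.)^3]
    J_1    0       ((1 + log 2)/4) - (π²/24) ≈ 0.012053               ≈ -0.00005733
    H_1    0       ((3 + log 2)/8) - (π²/24) ≈ 0.050410               ≈ 0.00323456
    G_1    0       ((19 + 4 log 2)/48) - (π²/24) ≈ 0.042362           ≈ 0.00444287

Table 2: First three moments for the random variables $J_1$, $H_1$, $G_1$.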



Our method for establishing convergence in distribution results is based on the
recursive nature of the ONG. Essential is its self-similarity (scaling property). In terms
of the total weight, this says that for any $t \in (0, 1)$, if $V_1, \ldots, V_n$ are independent and
uniformly distributed on $(0, t)$, then the distribution of $\mathcal{O}^{1,\alpha}(V_1, \ldots, V_n)$ is the same as
that of $t^\alpha\, \mathcal{O}^{1,\alpha}(U_1, \ldots, U_n)$.
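The scaling property rests on a deterministic fact: multiplying every point by $t$ leaves the nearest-neighbour structure unchanged and multiplies every edge length by $t$, hence every $\alpha$-weighted edge by $t^\alpha$. A minimal sketch (the function name `ong_weight_1d` and the sample points are illustrative assumptions, not taken from the paper):

```python
def ong_weight_1d(xs, alpha):
    """Total alpha-weighted length of the ONG on a sequence of reals."""
    total = 0.0
    for i in range(1, len(xs)):
        # each new point contributes its distance to the nearest earlier point
        total += min(abs(xs[i] - xs[k]) for k in range(i)) ** alpha
    return total

xs = [0.2, 0.9, 0.8, 0.3, 0.55]
t, alpha = 0.37, 0.7
scaled = [t * x for x in xs]
lhs = ong_weight_1d(scaled, alpha)           # weight on the rescaled points
rhs = t ** alpha * ong_weight_1d(xs, alpha)  # t^alpha times the original weight
```

Here `lhs` and `rhs` agree up to floating-point rounding. Applying this pathwise identity to uniform points on $(0, t)$, which are $t$ times uniform points on $(0, 1)$, gives the distributional statement.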

Write $U = U_1$ for the position of the first arrival. For ease of notation, denote

$$Y_n := \mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1}) - 1, \tag{28}$$

where by subtracting 1 we discount the length of the edge from 1 to 0. Then using the
self-similarity of the ONG, and conditioning on the first arrival, we have the following
relations:

$$\mathcal{O}^{1,\alpha}(\mathcal{U}_n) \stackrel{d}{=} U^\alpha\, \mathcal{O}^{1,\alpha}_{\{1\}}(\mathcal{U}^0_{N(n)}) + (1-U)^\alpha\, \mathcal{O}^{1,\alpha}_{\{2\}}(\mathcal{U}^0_{n-1-N(n)}), \tag{29}$$

$$\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0) \stackrel{d}{=} U^\alpha\, \mathcal{O}^{1,\alpha}_{\{1\}}(\mathcal{U}^{0,1}_{N(n)}) + (1-U)^\alpha\, \mathcal{O}^{1,\alpha}_{\{2\}}(\mathcal{U}^0_{n-1-N(n)}), \tag{30}$$

$$Y_n \stackrel{d}{=} (\min\{U, 1-U\})^\alpha + U^\alpha\, Y^{\{1\}}_{N(n)} + (1-U)^\alpha\, Y^{\{2\}}_{n-1-N(n)}, \tag{31}$$

where, given $U$, $N(n) \sim \mathrm{Bin}(n-1, U)$ gives the number of points of $U_2, U_3, \ldots, U_n$ that
arrive to the left of $U_1 = U$. Given $U$ and $N(n)$, $\mathcal{O}^{1,\alpha}_{\{1\}}(\cdot)$ and $\mathcal{O}^{1,\alpha}_{\{2\}}(\cdot)$ are independent
copies of $\mathcal{O}^{1,\alpha}(\cdot)$. Also, given $U$ and $N(n)$, $Y^{\{1\}}_{N(n)}$ and $Y^{\{2\}}_{n-1-N(n)}$ are independent with the
distribution of $Y_{N(n)}$ and $Y_{n-1-N(n)}$, respectively.

For $\alpha > 1$, we prove almost sure and $L^p$ ($p \in \mathbb{N}$) convergence of $\mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})$ and
$\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0)$ in the same way as in the proof of Theorem 2.1 (ii), and thereby obtain
the corresponding result for $\mathcal{O}^{1,\alpha}(\mathcal{U}_n)$. The relations (29), (30) and (31) will then enable
us to prove the desired results for $\alpha > 1$.

For $1/2 < \alpha < 1$, we use a result of Neininger and Rüschendorf on limit theorems
for 'divide and conquer' recurrences. However, we cannot apply this directly to (29) to
obtain the convergence of $\mathcal{O}^{1,\alpha}(\mathcal{U}_n)$, since (29) is not of the required form; the variables
on the right are not of the same type as the variable on the left. On the other hand, we
see that (31) is of the desired form. This will be the basis of our analysis for $1/2 < \alpha < 1$.

Indeed, by considering a vector defined in terms of all three of $\mathcal{O}^{1,\alpha}(\mathcal{U}_n)$, $\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0)$ and
$\mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})$, we obtain the recurrence relation (67) below. We can then apply the result of
Neininger and Rüschendorf. This is why we need to consider $\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0)$ and $\mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})$ in addition to $\mathcal{O}^{1,\alpha}(\mathcal{U}_n)$.

The outline of the remainder of this section is as follows. In Section 4.2 below, we
give a discussion of the theory of spacings, which will be very useful in the sequel. In
Section 4.3 we begin our analysis of the ONG with some preliminary results, based on
the discussion in Section 4.2. Then, in Sections 4.4, 4.5 and 4.6 we give results on $\mathcal{O}^{1,\alpha}(\cdot)$
when $1/2 < \alpha < 1$, $\alpha = 1$, and $\alpha > 1$ respectively. Finally, in Section 4.7 we give a proof
of Theorems 4.1 and 2.2.

4.2 Spacings 

The one-dimensional models considered in this paper (the ONG and the standard
nearest-neighbour graph) are defined in terms of the spacings of points in the unit interval. Thus
the theory of so-called Dirichlet spacings will be useful. For some general references on
spacings, see for example ^2]. A large number of statistical tests are based on spacings,
see e.g. |7j for a few examples.

Recall that $\mathcal{U}_n$ denotes the binomial point process consisting of $n$ independent
uniform random variables on (0,1), $U_1, U_2, \ldots, U_n$. Given $\{U_1, \ldots, U_n\} \subset (0,1)$, denote
the order statistics of $U_1, \ldots, U_n$, taken in increasing order, as $U^n_{(1)}, U^n_{(2)}, \ldots, U^n_{(n)}$. Thus
$(U^n_{(1)}, \ldots, U^n_{(n)})$ is a nondecreasing sequence, forming a permutation of the original $(U_1, \ldots, U_n)$.

The points $U_1, \ldots, U_n$ divide [0, 1] into $n + 1$ intervals. Denote the intervals between
points by $I_j^n := (U^n_{(j-1)}, U^n_{(j)})$ for $j = 1, 2, \ldots, n+1$, where we set $U^n_{(0)} := 0$ and $U^n_{(n+1)} := 1$.
Let the widths of these intervals (the spacings) be

$$S_j^n := |I_j^n| = U^n_{(j)} - U^n_{(j-1)},$$

for $j = 1, 2, \ldots, n+1$. For $n \in \mathbb{N}$, let $\Delta_n \subset \mathbb{R}^n$ denote the $n$-dimensional simplex, that is

$$\Delta_n := \left\{(x_1, \ldots, x_n) \in \mathbb{R}^n : x_i \ge 0,\ 1 \le i \le n;\ \sum_{i=1}^{n} x_i \le 1\right\}.$$

By the definition of $S_j^n$, we have that $S_j^n \ge 0$ for $j = 1, \ldots, n+1$ and $\sum_{j=1}^{n+1} S_j^n = 1$.
So we see that the vector $(S_1^n, S_2^n, \ldots, S_{n+1}^n)$ is completely specified by any $n$ of its $n+1$
components, and any such $n$-vector belongs to the simplex $\Delta_n$. It is not hard to show that
any such $n$-vector is, in fact, uniformly distributed over the simplex. Hence $(S_1^n, \ldots, S_n^n)$
is uniform over the simplex $\Delta_n$, and $S_{n+1}^n = 1 - \sum_{i=1}^{n} S_i^n$.

Thus $(S_1^n, S_2^n, \ldots, S_{n+1}^n)$ has the symmetric Dirichlet distribution with parameter 1
(see, e.g., 0, p. 246), and any $n$-vector of the $S_j^n$ has the Dirichlet density

$$f(x_1, \ldots, x_n) = n!, \qquad (x_1, \ldots, x_n) \in \Delta_n. \tag{32}$$
In particular, the spacings $S_j^n$, $j = 1, \ldots, n+1$ are exchangeable: the distribution of
$(S_1^n, S_2^n, \ldots, S_{n+1}^n)$ is invariant under any permutation of its components.

By integrating out over the simplex, from (32) one can readily obtain the marginal
distributions for the spacings. Thus, for $n \ge 1$, a single spacing has density

$$f(x_1) = n(1 - x_1)^{n-1}, \qquad 0 \le x_1 \le 1, \tag{33}$$

while for $n \ge 2$, any two spacings have joint density

$$f(x_1, x_2) = n(n-1)(1 - x_1 - x_2)^{n-2}, \qquad (x_1, x_2) \in \Delta_2, \tag{34}$$

and for $n \ge 3$ any three spacings have joint density

$$f(x_1, x_2, x_3) = n(n-1)(n-2)(1 - x_1 - x_2 - x_3)^{n-3}, \qquad (x_1, x_2, x_3) \in \Delta_3. \tag{35}$$

Using the fact that (see, e.g., 6.2.1 in [1])

$$\int_0^1 t^{a-1}(1-t)^{b-1}\, dt = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}, \tag{36}$$

for $a > 0$, $b > 0$, it then follows from (33) that, for $\beta > 0$, $n \ge 1$,

$$E\left[(S_1^n)^\beta\right] = \frac{\Gamma(n+1)\Gamma(\beta+1)}{\Gamma(n+\beta+1)}, \tag{37}$$

and from (34) that for $\beta > 0$, $n \ge 2$,

$$E\left[(S_1^n)^\beta (S_2^n)^\beta\right] = \frac{\Gamma(n+1)\Gamma(\beta+1)^2}{\Gamma(n+2\beta+1)}. \tag{38}$$
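When considering our nearest-neighbour graphs, we will encounter the minimum of two
(or more) spacings. The following results will also be needed in Section 5.

Lemma 4.1 For $n \ge 1$,

$$\min\{S_1^n, S_2^n\} \stackrel{d}{=} S_1^n / 2. \tag{39}$$

For $n \ge 2$,

$$\left(S_1^n, \min\{S_2^n, S_3^n\}\right) \stackrel{d}{=} \left(S_1^n, S_2^n / 2\right). \tag{40}$$

Finally, for $n \ge 3$,

$$\left(\min\{S_1^n, S_2^n\}, \min\{S_3^n, S_4^n\}\right) \stackrel{d}{=} \left(S_1^n / 2, S_2^n / 2\right), \tag{41}$$

and

$$\min\{S_1^n, S_2^n, S_3^n\} \stackrel{d}{=} S_1^n / 3. \tag{42}$$

Proof. We give the proof of (39). The other results follow by very similar calculations
based on (34) and (35). Suppose $n \ge 2$. From (34), we have, for $0 \le r \le 1/2$,

$$P[\min\{S_1^n, S_2^n\} > r] = P[S_1^n > r, S_2^n > r]
= n(n-1) \int_r^{1-r} dx_1 \int_r^{1-x_1} (1 - x_1 - x_2)^{n-2}\, dx_2
= (1 - 2r)^n = P[S_1^n > 2r],$$

and so we have (39). □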
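The identity for the minimum of two spacings can be checked by simulation. The sketch below is an illustration under stated assumptions (hypothetical function names, a fixed seed, and 20000 trials chosen arbitrarily): it estimates $P[\min\{S_1^n, S_2^n\} > r]$ empirically, to be compared with the tail $(1-2r)^n$ of $S_1^n/2$ that follows from the single-spacing density.

```python
import random

def spacings(n, rng):
    """The n + 1 spacings generated by n uniform points on (0, 1)."""
    pts = sorted(rng.random() for _ in range(n))
    grid = [0.0] + pts + [1.0]
    return [b - a for a, b in zip(grid, grid[1:])]

def tail_of_min(n, r, trials=20000, seed=1):
    """Monte Carlo estimate of P[min(S_1, S_2) > r]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = spacings(n, rng)
        if min(s[0], s[1]) > r:
            hits += 1
    return hits / trials

estimate = tail_of_min(5, 0.05)
exact = (1 - 2 * 0.05) ** 5   # tail of S_1/2 when n = 5
```

With $n = 5$ and $r = 0.05$ the exact value is $0.9^5 \approx 0.59$, and the estimate should agree to within Monte Carlo error (standard error about 0.0035 for 20000 trials).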
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4.3 Preparatory results 



We now return to the ONG. We make use of the discussion of spacings in Section 14.21 
For n G N let iJ„ and r„ denote the random variables given by the gain in length, 
on the addition of the point of the ONG on Un-i, l^n-i ^"^^ ^n-i respectively. That 
is, with the convention O^'^Z^o) = O^'H^o) = and 0^''^{Uq'^) = 1, for n G N set 

Zr,:=0''\Un)-0''\Un.l), (43) 

r„:=Oi'i(Z^r)-Oi'H^n-\)- 

Thus, for example, in the ONG{l/(n^) with weight function Wa as given by the nth 
edge to be added has weight T". 

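We will make use of the following discussion for the proof of Lemma 4.2 below. For $\alpha > 0$, with the definitions at (43) we have that

$$\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0) - \mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1}) = -1 + \sum_{i=1}^n (H_i^\alpha - T_i^\alpha), \text{ and} \tag{44}$$

$$\mathcal{O}^{1,\alpha}(\mathcal{U}_n) - \mathcal{O}^{1,\alpha}(\mathcal{U}_n^0) = \sum_{i=1}^n (Z_i^\alpha - H_i^\alpha) = -H_1^\alpha + \sum_{i=2}^n (Z_i^\alpha - H_i^\alpha), \tag{45}$$

since $Z_1 = 0$. Consider the arrival of the point $U_n$. For any $n$, $T_n$ and $H_n$ are the same unless the point $U_n$ falls in the right hand half of the rightmost interval, of width $S_n^{n-1}$. Denote this latter event by $E_n$. Given $S_n^{n-1}$, the probability of $E_n$ is $S_n^{n-1}/2$. Given $S_n^{n-1}$, and given that $E_n$ occurs, the value of $T_n$ is given by $(1 - V_n) S_n^{n-1}/2$ and the value of $H_n$ by $(1 + V_n) S_n^{n-1}/2$, where $V_n = 1 + 2(U_n - 1)/S_n^{n-1}$ is uniform on $(0,1)$ given $E_n$. So we have that, for $n \in \mathbb{N}$, given $S_n^{n-1}$,

$$H_n^\alpha - T_n^\alpha = \mathbf{1}_{E_n} \left( \frac{S_n^{n-1}}{2} \right)^{\alpha} \left( (1 + V_n)^\alpha - (1 - V_n)^\alpha \right), \tag{46}$$

where $E_n$ is an event with probability $S_n^{n-1}/2$. A similar argument (based this time on the leftmost spacing) yields that, for $n \ge 2$,

$$Z_n^\alpha - H_n^\alpha = \mathbf{1}_{F_n} \left( \frac{S_1^{n-1}}{2} \right)^{\alpha} \left( (1 + W_n)^\alpha - (1 - W_n)^\alpha \right), \tag{47}$$

where $F_n$ is an event with probability $S_1^{n-1}/2$ and, given $F_n$, $W_n$ is uniform on $(0,1)$.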

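We will need the following asymptotic expansion, which follows from Stirling's formula (see e.g. 6.1.37 in Abramowitz and Stegun's handbook). For any $\beta > 0$, as $n \to \infty$,

$$\frac{\Gamma(n+1)}{\Gamma(n+1+\beta)} = n^{-\beta} - \frac{1}{2}\beta(\beta+1) n^{-\beta-1} + O(n^{-\beta-2}). \tag{48}$$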



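Lemma 4.2 For $\alpha > 0$ and $n \ge 2$, we have that

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0) - \mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})] = \frac{1 - \alpha - 2^{-\alpha}}{\alpha} + (2^{-\alpha} - 1) \frac{\Gamma(1+\alpha)\Gamma(n+1)}{\alpha\, \Gamma(n+1+\alpha)} = \frac{1 - \alpha - 2^{-\alpha}}{\alpha} + O(n^{-\alpha}), \tag{49}$$

and

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n) - \mathcal{O}^{1,\alpha}(\mathcal{U}_n^0)] = \frac{1 - \alpha - 2^{-\alpha}}{\alpha(1+\alpha)} + O(n^{-\alpha}). \tag{50}$$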

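Proof. Suppose $\alpha > 0$. From (46) we have that for $n \in \mathbb{N}$

$$E[H_n^\alpha - T_n^\alpha \mid S_n^{n-1}] = \frac{1 - 2^{-\alpha}}{1 + \alpha} \left( S_n^{n-1} \right)^{1+\alpha}.$$

So by (37) we have that

$$E[H_n^\alpha - T_n^\alpha] = \frac{(1 - 2^{-\alpha}) \Gamma(1+\alpha) \Gamma(n)}{\Gamma(n+1+\alpha)}.$$

Thus, from (44),

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0) - \mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})] = -1 + \sum_{i=1}^n E[H_i^\alpha - T_i^\alpha] = -1 + (1 - 2^{-\alpha}) \Gamma(1+\alpha) \sum_{i=1}^n \frac{\Gamma(i)}{\Gamma(i+1+\alpha)} = \frac{1 - \alpha - 2^{-\alpha}}{\alpha} + (2^{-\alpha} - 1) \frac{\Gamma(1+\alpha)\Gamma(n+1)}{\alpha\, \Gamma(n+1+\alpha)},$$

the last equality following by induction on $n$. This then gives (49), with the asymptotic expression following by (48). Similarly, from (47),

$$E[Z_n^\alpha - H_n^\alpha] = \frac{(1 - 2^{-\alpha}) \Gamma(1+\alpha) \Gamma(n)}{\Gamma(n+1+\alpha)},$$

for $n \ge 2$, while $E[H_1^\alpha] = E[U_1^\alpha] = (\alpha + 1)^{-1}$ and $Z_1 = 0$. With (45), (50) follows. □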
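The induction step behind (49) is a telescoping identity for ratios of Gamma functions, and it is easy to check numerically. The sketch below is illustrative only (the function names are assumptions, not the authors'); it evaluates the sum of the increments $E[H_i^\alpha - T_i^\alpha]$ directly and compares it with the closed form.

```python
import math


def gamma_ratio(a, b):
    """Gamma(a) / Gamma(b), computed on the log scale for stability."""
    return math.exp(math.lgamma(a) - math.lgamma(b))


def mean_diff_by_sum(n, alpha):
    """-1 + sum_{i=1}^n E[H_i^alpha - T_i^alpha], using the increment formula."""
    c = (1 - 2 ** -alpha) * math.gamma(1 + alpha)
    return -1 + c * sum(gamma_ratio(i, i + 1 + alpha) for i in range(1, n + 1))


def mean_diff_closed(n, alpha):
    """Closed form on the right-hand side of (49)."""
    const = (1 - alpha - 2 ** -alpha) / alpha
    tail = (2 ** -alpha - 1) * math.gamma(1 + alpha) * gamma_ratio(n + 1, n + 1 + alpha) / alpha
    return const + tail


for alpha in (0.5, 1.0, 2.5):
    for n in (2, 10, 200):
        print(alpha, n, mean_diff_by_sum(n, alpha), mean_diff_closed(n, alpha))
```

For $\alpha = 1$ the constant term is $-1/2$, in agreement with the value $E[R_n] = 1/2 + O(n^{-1})$ used later once the $+1$ in the definition of $R_n$ is included.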

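Lemma 4.3 (i) For $n \in \mathbb{N}$, $T_n$ as defined at (43) has distribution function $F_n$ given by $F_n(t) = 0$ for $t < 0$, $F_n(t) = 1$ for $t > 1/2$, and $F_n(t) = 1 - (1 - 2t)^n$ for $0 \le t \le 1/2$.

(ii) For $\beta > 0$,

$$E[T_n^\beta] = 2^{-\beta} \frac{\Gamma(n+1)\Gamma(\beta+1)}{\Gamma(n+\beta+1)}. \tag{51}$$

In particular,

$$E[T_n] = \frac{1}{2(n+1)}; \qquad \mathrm{Var}[T_n] = \frac{n}{4(n+1)^2(n+2)}. \tag{52}$$

(iii) For $\beta > 0$, as $n \to \infty$,

$$E[T_n^\beta] = 2^{-\beta} \Gamma(\beta+1) n^{-\beta} + O(n^{-\beta-1}). \tag{53}$$

(iv) As $n \to \infty$,

$$2 n T_n \xrightarrow{d} \mathrm{Exp}(1),$$

where $\mathrm{Exp}(1)$ is an exponential random variable with parameter 1.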

Proof. By conditioning on the number of $U_j$, $j < n$, with $U_j < U_n$, using Lemma 4.1 and by exchangeability of the spacings, we have that for $n \ge 1$, $T_n \stackrel{d}{=} \min\{S_1^n, S_2^n\} \stackrel{d}{=} S_1^n/2$, by (39). Then (i) follows by (33), and (ii) follows by (37). Part (iii) then follows from part (ii) by (48). For (iv), we have that, for $t \in [0, \infty)$, and $n$ large enough so that $t/(2n) \le 1/2$,

$$P[2 n T_n > t] = P[T_n > t/(2n)] = (1 - (t/n))^n \to e^{-t},$$

as $n \to \infty$, but $1 - e^{-t}$, $t \ge 0$, is the distribution function of an exponential random variable with parameter 1. □
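As a sanity check on Lemma 4.3, one can simulate $T_n$ directly from its definition. This is a rough Monte Carlo sketch with an arbitrary seed and sample size (both assumptions chosen for illustration); `sample_T` is a hypothetical helper name.

```python
import random


def sample_T(n, rng):
    """One draw of T_n: distance from U_n to its nearest neighbour
    among 0, 1 and the earlier points U_1, ..., U_{n-1}."""
    earlier = [0.0, 1.0] + [rng.random() for _ in range(n - 1)]
    u_n = rng.random()
    return min(abs(u_n - p) for p in earlier)


rng = random.Random(1)
n, trials = 10, 20000
samples = [sample_T(n, rng) for _ in range(trials)]

mean_est = sum(samples) / trials
mean_exact = 1 / (2 * (n + 1))            # E[T_n] from (52)

tail_est = sum(1 for t in samples if 2 * n * t > 1.0) / trials
tail_exact = (1 - 1.0 / n) ** n           # P[2 n T_n > 1] from part (i)

print(mean_est, mean_exact)
print(tail_est, tail_exact)
```

The exact tail $(1 - 1/n)^n$ approaches $e^{-1}$ as $n$ grows, which is the content of part (iv) at $t = 1$.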
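Proposition 4.1 Recall that $\gamma \approx 0.57721566$ is Euler's constant. Suppose $\alpha > 0$. As $n \to \infty$, we have

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})] = \frac{\Gamma(\alpha+1)}{1-\alpha} 2^{-\alpha} n^{1-\alpha} + 1 - \frac{2^{-\alpha}}{1-\alpha} + O(n^{-\alpha}) \quad (0 < \alpha < 1); \tag{54}$$

$$E[\mathcal{O}^{1,1}(\mathcal{U}_n^{0,1})] = \frac{1}{2}\log n + \frac{1}{2}(\gamma + 1) + O(n^{-1}); \tag{55}$$

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})] = 1 + \frac{2^{-\alpha}}{\alpha - 1} + O(n^{1-\alpha}) \quad (\alpha > 1). \tag{56}$$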

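Proof. Counting the first edge from 1 to 0, we have

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})] = 1 + \sum_{i=1}^n \left( E[\mathcal{O}^{1,\alpha}(\mathcal{U}_i^{0,1})] - E[\mathcal{O}^{1,\alpha}(\mathcal{U}_{i-1}^{0,1})] \right) = 1 + \sum_{i=1}^n E[T_i^\alpha].$$

In the case where $\alpha = 1$, $E[T_i] = (2(i+1))^{-1}$ by (52), and (55) follows from the definition of $\gamma$. For general $\alpha > 0$, $\alpha \ne 1$, from (51) we have that

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1})] = 1 + 2^{-\alpha} \Gamma(1+\alpha) \sum_{i=1}^n \frac{\Gamma(i+1)}{\Gamma(1+\alpha+i)} = 1 + \frac{2^{-\alpha}}{\alpha-1} - \frac{2^{-\alpha}\Gamma(1+\alpha)\Gamma(n+2)}{(\alpha-1)\Gamma(n+1+\alpha)}, \tag{57}$$

the final equality proved by induction on $n$. By Stirling's formula, the last term satisfies

$$-\frac{2^{-\alpha}\Gamma(1+\alpha)\Gamma(n+2)}{(\alpha-1)\Gamma(n+1+\alpha)} = -\frac{2^{-\alpha}\Gamma(1+\alpha)}{\alpha-1} n^{1-\alpha} \left( 1 + O(n^{-1}) \right), \tag{58}$$

which tends to zero as $n \to \infty$ for $\alpha > 1$, to give us (56). For $\alpha < 1$, we have (54) from (57) and (58). □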
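The closed form (57) and the logarithmic growth in (55) can both be checked by summing the exact moments (51). The following sketch is an illustration only; the function names and the hard-coded value of Euler's constant are assumptions introduced here.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for the check


def expected_T_power(i, alpha):
    """E[T_i^alpha] from (51)."""
    log_ratio = math.lgamma(i + 1) + math.lgamma(alpha + 1) - math.lgamma(i + alpha + 1)
    return 2 ** -alpha * math.exp(log_ratio)


def mean_weight_by_sum(n, alpha):
    """E[O^{1,alpha}(U_n^{0,1})] = 1 + sum_{i=1}^n E[T_i^alpha]."""
    return 1 + sum(expected_T_power(i, alpha) for i in range(1, n + 1))


def mean_weight_closed(n, alpha):
    """Right-hand side of (57); only valid for alpha != 1."""
    lead = 2 ** -alpha / (alpha - 1)
    ratio = math.exp(math.lgamma(n + 2) - math.lgamma(n + 1 + alpha))
    return 1 + lead - lead * math.gamma(1 + alpha) * ratio


for alpha in (0.7, 1.5, 3.0):
    for n in (1, 5, 500):
        print(alpha, n, mean_weight_by_sum(n, alpha), mean_weight_closed(n, alpha))

n = 10 ** 4
log_case = mean_weight_by_sum(n, 1.0)
log_asymptotic = 0.5 * math.log(n) + 0.5 * (EULER_GAMMA + 1)  # leading terms of (55)
print(log_case, log_asymptotic)
```

The difference between the last two printed values is of order $1/n$, consistent with the error term in (55).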

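Proposition 4.2 Suppose $\alpha > 0$. As $n \to \infty$, we have

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0)] = \frac{\Gamma(\alpha+1)}{1-\alpha} 2^{-\alpha} n^{1-\alpha} + \frac{1}{\alpha} - \frac{2^{-\alpha}}{\alpha(1-\alpha)} + O(n^{-\alpha}) \quad (0 < \alpha < 1); \tag{59}$$

$$E[\mathcal{O}^{1,1}(\mathcal{U}_n^0)] = \frac{1}{2}\log n + \frac{\gamma}{2} + O(n^{-1}); \tag{60}$$

$$E[\mathcal{O}^{1,\alpha}(\mathcal{U}_n^0)] = \frac{1}{\alpha} + \frac{2^{-\alpha}}{\alpha(\alpha-1)} + O(n^{1-\alpha}) \quad (\alpha > 1). \tag{61}$$

Proof. This follows from Proposition 4.1 with (49). □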

4.4 Limit theory when $1/2 < \alpha < 1$

Let $U$ be uniform on $(0,1)$, and given $U$, let $N(n) \sim \mathrm{Bin}(n-1, U)$. Set

$$B_\alpha(n) := (n-1)^{1/2} \left( U^\alpha \left( \frac{N(n)}{n-1} \right)^{1-\alpha} + (1-U)^\alpha \left( \frac{n-1-N(n)}{n-1} \right)^{1-\alpha} - 1 \right). \tag{62}$$

Lemma 4.4 Suppose $0 \le \alpha \le 1$. Then, as $n \to \infty$,

$$B_\alpha(n) \xrightarrow{L^3} 0. \tag{63}$$

We defer the proof of this lemma to the Appendix. Note that for what follows in this paper we will only use $L^2$ convergence in (63). However, the stronger $L^3$ version requires little extra work, and we will require the $L^3$ version in future work dealing with the $\alpha \in (0, 1/2]$ case.



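Proposition 4.3 Suppose $1/2 < \alpha < 1$. Then as $n \to \infty$,

$$\begin{pmatrix} \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0,1}) \\ \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0}) - \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0,1}) \\ \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n) - \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0}) \end{pmatrix} \xrightarrow{d} \begin{pmatrix} \tilde J_\alpha \\ R \\ S \end{pmatrix}, \tag{64}$$

where $(\tilde J_\alpha, R, S)$ satisfies the fixed-point equation

$$\begin{pmatrix} \tilde J_\alpha \\ R \\ S \end{pmatrix} \stackrel{d}{=} \begin{pmatrix} U^\alpha & 0 & 0 \\ 0 & 0 & 0 \\ 0 & U^\alpha & 0 \end{pmatrix} \begin{pmatrix} \tilde J_\alpha^{\{1\}} \\ R^{\{1\}} \\ S^{\{1\}} \end{pmatrix} + \begin{pmatrix} (1-U)^\alpha & 0 & 0 \\ 0 & (1-U)^\alpha & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \tilde J_\alpha^{\{2\}} \\ R^{\{2\}} \\ S^{\{2\}} \end{pmatrix} + \begin{pmatrix} \min\{U, 1-U\}^\alpha + \frac{2^{-\alpha}}{\alpha-1} \left( U^\alpha + (1-U)^\alpha - 1 \right) \\ \left( U^\alpha - (1-U)^\alpha \right) \mathbf{1}_{\{U > 1/2\}} + \frac{1 - 2^{-\alpha}}{\alpha} \left( (1-U)^\alpha - 1 \right) \\ \frac{1 - \alpha - 2^{-\alpha}}{\alpha} \left( U^\alpha - \frac{1}{1+\alpha} \right) \end{pmatrix}. \tag{65}$$

In particular, $\tilde J_\alpha$ satisfies the fixed-point equation (10). Also, $E[\tilde J_\alpha] = E[R] = E[S] = 0$.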

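Proof. We make use of Theorem 4.1 of [11], which is a general result for 'divide-and-conquer' type recurrences. Recall the definition of $Y_n$ at (28). Let

$$R_n := \mathcal{O}^{1,\alpha}(\mathcal{U}_n^0) - \mathcal{O}^{1,\alpha}(\mathcal{U}_n^{0,1}) + 1, \qquad S_n := \mathcal{O}^{1,\alpha}(\mathcal{U}_n) - \mathcal{O}^{1,\alpha}(\mathcal{U}_n^0). \tag{66}$$

Write $U = U_1$ for the position of the first arrival. Given $U$, let $N(n) \sim \mathrm{Bin}(n-1, U)$ be the number of points of $U_2, U_3, \ldots, U_n$ that arrive to the left of $U_1 = U$. Using the self-similarity of the ONG, we have that $(Y_n, R_n, S_n)$ satisfies, for $\alpha > 0$,

$$\begin{pmatrix} Y_n \\ R_n \\ S_n \end{pmatrix} \stackrel{d}{=} \begin{pmatrix} U^\alpha & 0 & 0 \\ 0 & 0 & 0 \\ 0 & U^\alpha & 0 \end{pmatrix} \begin{pmatrix} Y^{\{1\}}_{N(n)} \\ R^{\{1\}}_{N(n)} \\ S^{\{1\}}_{N(n)} \end{pmatrix} + \begin{pmatrix} (1-U)^\alpha & 0 & 0 \\ 0 & (1-U)^\alpha & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} Y^{\{2\}}_{n-1-N(n)} \\ R^{\{2\}}_{n-1-N(n)} \\ S^{\{2\}}_{n-1-N(n)} \end{pmatrix} + \begin{pmatrix} \min\{U, 1-U\}^\alpha \\ \left( U^\alpha - (1-U)^\alpha \right) \mathbf{1}_{\{U > 1/2\}} \\ -U^\alpha \end{pmatrix}, \tag{67}$$

where, given $U$ and $N(n)$, $Y^{\{1\}}_{N(n)}$, $Y^{\{2\}}_{n-1-N(n)}$ are independent copies of $Y_{N(n)}$, $Y_{n-1-N(n)}$ respectively, and similarly for the $R$s and $S$s. This equation is of the form of (21) in [11].

Suppose $1/2 < \alpha < 1$. We now renormalise (67) by taking

$$(\tilde Y_n, \tilde R_n, \tilde S_n) := (Y_n - E[Y_n],\; R_n - E[R_n],\; S_n - E[S_n]), \tag{68}$$

so in the notation of [11] we take $C_n = 1$. That is,

$$\tilde Y_n = \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0,1}), \quad \tilde R_n = \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0}) - \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0,1}), \quad \tilde S_n = \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n) - \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0}). \tag{69}$$

Also set

$$\tilde Y^{\{1\}}_{N(n)} := Y^{\{1\}}_{N(n)} - E[Y_{N(n)} \mid N(n)]; \tag{70}$$

$$\tilde Y^{\{2\}}_{n-1-N(n)} := Y^{\{2\}}_{n-1-N(n)} - E[Y_{n-1-N(n)} \mid N(n)], \tag{71}$$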
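and similarly for the $R$s and $S$s. Using the expressions for the expectations at (54), (49) and (50), from (67) we obtain

$$\begin{pmatrix} \tilde Y_n \\ \tilde R_n \\ \tilde S_n \end{pmatrix} \stackrel{d}{=} \begin{pmatrix} U^\alpha & 0 & 0 \\ 0 & 0 & 0 \\ 0 & U^\alpha & 0 \end{pmatrix} \begin{pmatrix} \tilde Y^{\{1\}}_{N(n)} \\ \tilde R^{\{1\}}_{N(n)} \\ \tilde S^{\{1\}}_{N(n)} \end{pmatrix} + \begin{pmatrix} (1-U)^\alpha & 0 & 0 \\ 0 & (1-U)^\alpha & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \tilde Y^{\{2\}}_{n-1-N(n)} \\ \tilde R^{\{2\}}_{n-1-N(n)} \\ \tilde S^{\{2\}}_{n-1-N(n)} \end{pmatrix}$$

$$\qquad + \begin{pmatrix} \min\{U, 1-U\}^\alpha + C_\alpha (n-1)^{(1/2)-\alpha} B_\alpha(n) + \frac{2^{-\alpha}}{\alpha-1} \left( U^\alpha + (1-U)^\alpha - 1 \right) \\ \left( U^\alpha - (1-U)^\alpha \right) \mathbf{1}_{\{U > 1/2\}} + \frac{1 - 2^{-\alpha}}{\alpha} \left( (1-U)^\alpha - 1 \right) \\ \frac{1 - \alpha - 2^{-\alpha}}{\alpha} \left( U^\alpha - \frac{1}{1+\alpha} \right) \end{pmatrix} + \begin{pmatrix} U^\alpha h(N(n)) + (1-U)^\alpha h(n-1-N(n)) - h(n) \\ (1-U)^\alpha k(n-1-N(n)) - k(n) \\ U^\alpha k(N(n)) - \ell(n) \end{pmatrix}, \tag{72}$$

where $B_\alpha(n)$ is as defined at (62), $h(n)$, $k(n)$, $\ell(n)$ are all $o(1)$ as $n \to \infty$ and $C_\alpha$ is a constant.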

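In order to apply Theorem 4.1 of [11], we need to verify the conditions (24), (25) and (26) there. By Lemma 4.4, $(n-1)^{(1/2)-\alpha} B_\alpha(n)$ tends to zero in $L^2$ as $n \to \infty$, for $1/2 < \alpha < 1$. Thus, for condition (24) in [11], as $n \to \infty$, the sum of the two additive vectors on the right-hand side of (72) converges in $L^2$ to

$$\begin{pmatrix} \min\{U, 1-U\}^\alpha + \frac{2^{-\alpha}}{\alpha-1} \left( U^\alpha + (1-U)^\alpha - 1 \right) \\ \left( U^\alpha - (1-U)^\alpha \right) \mathbf{1}_{\{U > 1/2\}} + \frac{1 - 2^{-\alpha}}{\alpha} \left( (1-U)^\alpha - 1 \right) \\ \frac{1 - \alpha - 2^{-\alpha}}{\alpha} \left( U^\alpha - \frac{1}{1+\alpha} \right) \end{pmatrix}. \tag{73}$$

Also, writing $\| \cdot \|_{\mathrm{op}}$ for the operator norm, for condition (25) in [11],

$$E \left\| \begin{pmatrix} U^\alpha & 0 & 0 \\ 0 & 0 & 0 \\ 0 & U^\alpha & 0 \end{pmatrix} \right\|_{\mathrm{op}}^2 + E \left\| \begin{pmatrix} (1-U)^\alpha & 0 & 0 \\ 0 & (1-U)^\alpha & 0 \\ 0 & 0 & 0 \end{pmatrix} \right\|_{\mathrm{op}}^2 = E[U^{2\alpha}] + E[(1-U)^{2\alpha}] = \frac{2}{2\alpha+1} < 1, \tag{74}$$

for $\alpha > 1/2$. Finally, for condition (26) in [11], for $\alpha > 0$ and any $\ell \in \mathbb{N}$, as $n \to \infty$,

$$E\left[ \mathbf{1}_{\{N(n) \le \ell\} \cup \{N(n) = n\}} U^{2\alpha} \right] \to 0; \qquad E\left[ \mathbf{1}_{\{n-1-N(n) \le \ell\} \cup \{n-1-N(n) = n\}} (1-U)^{2\alpha} \right] \to 0. \tag{75}$$

Taking $s = 2$ and $C_n$ to be the identity matrix, Theorem 4.1 of [11] applied to equation (72), with the conditions (73), (74) and (75), implies that $(\tilde Y_n, \tilde R_n, \tilde S_n)$ converges in the Zolotarev $\zeta_2$ metric (which implies convergence in distribution; see e.g. Chapter 14 of [20]) to $(Y, R, S)$, where $E[Y] = E[R] = E[S] = 0$ and the distribution of $(Y, R, S)$ is characterized by the fixed-point equation

$$\begin{pmatrix} Y \\ R \\ S \end{pmatrix} \stackrel{d}{=} \begin{pmatrix} U^\alpha & 0 & 0 \\ 0 & 0 & 0 \\ 0 & U^\alpha & 0 \end{pmatrix} \begin{pmatrix} Y^{\{1\}} \\ R^{\{1\}} \\ S^{\{1\}} \end{pmatrix} + \begin{pmatrix} (1-U)^\alpha & 0 & 0 \\ 0 & (1-U)^\alpha & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} Y^{\{2\}} \\ R^{\{2\}} \\ S^{\{2\}} \end{pmatrix} + \begin{pmatrix} \min\{U, 1-U\}^\alpha + \frac{2^{-\alpha}}{\alpha-1} \left( (1-U)^\alpha + U^\alpha - 1 \right) \\ \left( U^\alpha - (1-U)^\alpha \right) \mathbf{1}_{\{U > 1/2\}} + \frac{1 - 2^{-\alpha}}{\alpha} \left( (1-U)^\alpha - 1 \right) \\ \frac{1 - \alpha - 2^{-\alpha}}{\alpha} \left( U^\alpha - \frac{1}{1+\alpha} \right) \end{pmatrix}. \tag{76}$$

That is, $Y$ satisfies (10), so that $Y$ has the distribution of $\tilde J_\alpha$, and setting $Y = \tilde J_\alpha$ in (76) gives (65). Then (64) follows by (69). □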



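4.5 Limit theory when $\alpha = 1$

Proposition 4.4 below is our main convergence result when $\alpha = 1$. First, we need the following result, the proof of which we defer to the Appendix. For $x > 0$, set $\log^+ x := \max\{\log x, 0\}$.

Lemma 4.5 Let $U$ be uniform on $(0,1)$ and, given $U$, let $N(n) \sim \mathrm{Bin}(n-1, U)$. Then, as $n \to \infty$,

$$U(\log^+ N(n) - \log n) \xrightarrow{L^2} U \log U; \tag{77}$$

$$(1-U)(\log^+(n-1-N(n)) - \log n) \xrightarrow{L^2} (1-U)\log(1-U). \tag{78}$$

Proposition 4.4 As $n \to \infty$,

$$\begin{pmatrix} \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n^{0,1}) \\ \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n^{0}) - \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n^{0,1}) \\ \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n) - \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n^{0}) \end{pmatrix} \xrightarrow{d} \begin{pmatrix} \tilde J_1 \\ R \\ S \end{pmatrix}, \tag{79}$$

where $(\tilde J_1, R, S)$ satisfies the fixed-point equation

$$\begin{pmatrix} \tilde J_1 \\ R \\ S \end{pmatrix} \stackrel{d}{=} \begin{pmatrix} U & 0 & 0 \\ 0 & 0 & 0 \\ 0 & U & 0 \end{pmatrix} \begin{pmatrix} \tilde J_1^{\{1\}} \\ R^{\{1\}} \\ S^{\{1\}} \end{pmatrix} + \begin{pmatrix} 1-U & 0 & 0 \\ 0 & 1-U & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \tilde J_1^{\{2\}} \\ R^{\{2\}} \\ S^{\{2\}} \end{pmatrix} + \begin{pmatrix} \frac{U}{2}\log U + \frac{1-U}{2}\log(1-U) + \min\{U, 1-U\} \\ (2U-1)\mathbf{1}_{\{U > 1/2\}} - \frac{U}{2} \\ \frac{1}{4} - \frac{U}{2} \end{pmatrix}. \tag{80}$$

In particular, the first row of (80) is the fixed-point equation satisfied by $\tilde J_1$. Also, $E[\tilde J_1] = E[R] = E[S] = 0$, $\mathrm{Var}[R] = 1/16$, $\mathrm{Var}[S] = 1/24$, and

$$\mathrm{Var}[\tilde J_1] = \frac{1}{4}(1 + \log 2) - \frac{\pi^2}{24} \approx 0.012053 \quad \text{and} \quad E[\tilde J_1^3] \approx -0.00005732546. \tag{81}$$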



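Proof. We follow the proof of Proposition 4.3. Recall the definition of $Y_n$ at (28). Again define $R_n$ and $S_n$ as at (66), this time with $\alpha = 1$. Then we have that the $\alpha = 1$ case of (67) holds. We now renormalise (67), with the notation of (68) and (70). By (55) we have

$$E[Y_n] = E[\mathcal{O}^{1,1}(\mathcal{U}_n^{0,1})] - 1 = \frac{1}{2}\log n + \frac{1}{2}(\gamma - 1) + h(n),$$

where $h(n) = o(1)$, while by the $\alpha = 1$ case of (49) $E[R_n] = (1/2) + k(n)$, where $k(n) = O(n^{-1})$, and by the $\alpha = 1$ case of (50) $E[S_n] = -(1/4) + \ell(n)$, where $\ell(n) = O(n^{-1})$. Then by (67)

$$\begin{pmatrix} \tilde Y_n \\ \tilde R_n \\ \tilde S_n \end{pmatrix} \stackrel{d}{=} \begin{pmatrix} U & 0 & 0 \\ 0 & 0 & 0 \\ 0 & U & 0 \end{pmatrix} \begin{pmatrix} \tilde Y^{\{1\}}_{N(n)} \\ \tilde R^{\{1\}}_{N(n)} \\ \tilde S^{\{1\}}_{N(n)} \end{pmatrix} + \begin{pmatrix} 1-U & 0 & 0 \\ 0 & 1-U & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \tilde Y^{\{2\}}_{n-1-N(n)} \\ \tilde R^{\{2\}}_{n-1-N(n)} \\ \tilde S^{\{2\}}_{n-1-N(n)} \end{pmatrix}$$

$$\qquad + \begin{pmatrix} U h(N(n)) + (1-U) h(n-1-N(n)) - h(n) \\ (1-U) k(n-1-N(n)) - k(n) \\ U k(N(n)) - \ell(n) \end{pmatrix} + \begin{pmatrix} \min\{U, 1-U\} + \frac{U}{2}(\log^+ N(n) - \log n) + \frac{1-U}{2}(\log^+(n-1-N(n)) - \log n) \\ (2U-1)\mathbf{1}_{\{U > 1/2\}} - \frac{U}{2} \\ \frac{1}{4} - \frac{U}{2} \end{pmatrix}. \tag{82}$$

The conditions of Theorem 4.1 of [11] are satisfied, by (74), (75) and Lemma 4.5. Taking $s = 2$ and $C_n$ to be the identity, Theorem 4.1 of [11] applied to equation (82) shows that $(\tilde Y_n, \tilde R_n, \tilde S_n)$ converges in the Zolotarev $\zeta_2$ metric and hence in distribution to $(Y, R, S)$, where $E[Y] = E[R] = E[S] = 0$ and the distribution of $(Y, R, S)$ is characterized by the fixed-point equation

$$\begin{pmatrix} Y \\ R \\ S \end{pmatrix} \stackrel{d}{=} \begin{pmatrix} U & 0 & 0 \\ 0 & 0 & 0 \\ 0 & U & 0 \end{pmatrix} \begin{pmatrix} Y^{\{1\}} \\ R^{\{1\}} \\ S^{\{1\}} \end{pmatrix} + \begin{pmatrix} 1-U & 0 & 0 \\ 0 & 1-U & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} Y^{\{2\}} \\ R^{\{2\}} \\ S^{\{2\}} \end{pmatrix} + \begin{pmatrix} \frac{U}{2}\log U + \frac{1-U}{2}\log(1-U) + \min\{U, 1-U\} \\ (2U-1)\mathbf{1}_{\{U > 1/2\}} - \frac{U}{2} \\ \frac{1}{4} - \frac{U}{2} \end{pmatrix}. \tag{83}$$

That is, $Y$ satisfies the same fixed-point equation as $\tilde J_1$, so that $Y$ has the distribution of $\tilde J_1$, and setting $Y = \tilde J_1$ in (83) gives (80). By the $\alpha = 1$ case of (69) we then have (79).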

It remains to prove the results for the higher moments of $\tilde J_1$. For the variance of $\tilde J_1$, squaring both sides of the fixed-point equation satisfied by $\tilde J_1$, taking expectations, and using independence and the fact that $E[\tilde J_1] = 0$, we obtain

$$E[\tilde J_1^2] = \frac{2}{3} E[\tilde J_1^2] + E[\min\{U, 1-U\}^2] + \frac{1}{2} E[U^2 (\log U)^2] + \frac{1}{2} E[U(1-U) \log U \log(1-U)] + 2 E[U \log U \min\{U, 1-U\}].$$

The integrals required for the expectations are standard, and we find that $E[\tilde J_1^2] = ((1 + \log 2)/4) - (\pi^2/24)$, which yields (81). Similarly, we obtain the third moment $E[\tilde J_1^3] = -0.00005732546\ldots$ from the same fixed-point equation, although in this case numerical methods are required for some of the integrals. □
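The variance computation can be reproduced numerically. Writing $T$ for the additive term $\frac{U}{2}\log U + \frac{1-U}{2}\log(1-U) + \min\{U, 1-U\}$, which has mean zero, the second-moment equation reduces to $E[\tilde J_1^2] = 3E[T^2]$. The sketch below uses a plain midpoint rule, chosen here for self-containedness rather than taken from the paper; the names `midpoint` and `additive_term` are illustrative.

```python
import math


def midpoint(f, a, b, steps=200000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def additive_term(u):
    """(u/2) log u + ((1-u)/2) log(1-u) + min{u, 1-u} for 0 < u < 1."""
    return 0.5 * u * math.log(u) + 0.5 * (1 - u) * math.log(1 - u) + min(u, 1 - u)


mean_T = midpoint(additive_term, 0.0, 1.0)
second_moment_T = midpoint(lambda u: additive_term(u) ** 2, 0.0, 1.0)

# E[J^2] = (2/3) E[J^2] + E[T^2], hence Var[J] = 3 E[T^2].
var_numeric = 3.0 * second_moment_T
var_closed = (1 + math.log(2)) / 4 - math.pi ** 2 / 24

print(mean_T, var_numeric, var_closed)
```

The midpoint rule never evaluates the integrand at the endpoints, so the logarithmic terms cause no difficulty.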



4.6 Limit theory for $\alpha > 1$

Proposition 4.5 Let a > 1. 

(i) There exists a r.v. Ja such that as n ^ 00 0^'°'{IJn^) 1 + Ja cl.s. and in L^, 
p G'N. Also, Ja satisfies the fixed-point equality Ucl\) . and E[Ja] = 2~°^/{a — 1). 

(ii) There exists a r.v. Ha such that as n ^ 00 0^'°^{U^) Ha a.s. and in L^. Also, 
Ha satisfies the fixed-point equality and E[Ha\ = (1/a) + 2~"/(a(Q — 1)). 

Proof. First we prove part (i). Let Tj be the length of the ith edge of the ONG on Un^ , 
as defined at Let J a '■= Yli^i'^f- "^^^ converges almost surely since it has 

non- negative terms and, by (|5()|) . has finite expectation for a > 1. By a similar argument 
as the Proof of Theorem 12.11 (ii) in Section |21 the convergence follows by Holder's 
inequality and dominated convergence. 
We now identify the limit. We have (31), this time for $\alpha > 1$. As $n \to \infty$, $N(n)$ and
n — N{n) both tend to infinity almost surely, and so, by taking n — > oo in (|31() . we obtain 
the fixed-point equation (|TT?|) . 

The identity i?[Ja] = 2^"(q; — is obtained either from H56|) . or by taking expec- 
tations in (Uni). Next, if we set Ja = Ja - E[Ja], JEl) yields (fTI7|) . 

We now prove part (ii). Following the above argument with the Hi replacing the Tj 
and using (|6T|) in place of (|56|) gives that converges a.s. and in L^, p G N, to 

some random variable. Once more, we need to identify the limit. 

Consider the a > 1 case of (j30|) . As n — > oo, N{n) and n — N{n) both tend to infinity 
almost surely, and so, by taking n — > oo in (jJOJ, and using the fact that 0^''^{U^^f^^^) 
converges almost surely to 1 + Jq (by part (i)), and that converges 
almost surely to Ha (by the argument above) we obtain the fixed-point equation ()14p . 

The identity E[Ha] = a~^ + 2^°'a~^ (a—1) ^ is obtained either from (|6H) . or by taking 
expectations in Next, if we set Ha = Ha - E[Ha], (CH) yields □ 



4.7 Proof of Theorems SH] and EH 

Proof of Theorem 14.11 First we prove part (i) of the theorem. For 1/2 < a < 1 we 

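have that

$$\begin{pmatrix} \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0,1}) \\ \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0}) \\ \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n) \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0,1}) \\ \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0}) - \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0,1}) \\ \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n) - \tilde{\mathcal{O}}^{1,\alpha}(\mathcal{U}_n^{0}) \end{pmatrix} \xrightarrow{d} \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} \tilde J_\alpha \\ R \\ S \end{pmatrix}, \tag{84}$$

as $n \to \infty$, by Proposition 4.3. By (65), the final term in (84) is equal in distribution to the same lower-triangular matrix applied to the right-hand side of (65). Multiplying out and using the fact that $(U^\alpha - (1-U)^\alpha)\mathbf{1}_{\{U > 1/2\}} = U^\alpha - \min\{U, 1-U\}^\alpha$, we obtain

$$\begin{pmatrix} \tilde J_\alpha \\ \tilde J_\alpha + R \\ \tilde J_\alpha + R + S \end{pmatrix} \stackrel{d}{=} U^\alpha \begin{pmatrix} \tilde J_\alpha^{\{1\}} \\ \tilde J_\alpha^{\{1\}} \\ \tilde J_\alpha^{\{1\}} + R^{\{1\}} \end{pmatrix} + (1-U)^\alpha \begin{pmatrix} \tilde J_\alpha^{\{2\}} \\ \tilde J_\alpha^{\{2\}} + R^{\{2\}} \\ \tilde J_\alpha^{\{2\}} + R^{\{2\}} \end{pmatrix} + \begin{pmatrix} \min\{U, 1-U\}^\alpha + \frac{2^{-\alpha}}{\alpha-1} \left( (1-U)^\alpha + U^\alpha - 1 \right) \\ U^\alpha + \frac{2^{-\alpha}}{\alpha-1} \left( (1-U)^\alpha + U^\alpha - 1 \right) + \frac{1 - 2^{-\alpha}}{\alpha} \left( (1-U)^\alpha - 1 \right) \\ \left( U^\alpha + (1-U)^\alpha - \frac{2}{1+\alpha} \right) \frac{1 - \alpha - 2^{-\alpha}}{\alpha(1-\alpha)} \end{pmatrix}.$$

So setting $\tilde H_\alpha = \tilde J_\alpha + R$ and $\tilde G_\alpha = \tilde J_\alpha + R + S$, we have (26).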

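Now we prove part (ii) of the theorem. For $\alpha = 1$, as an analogue of (84),

$$\begin{pmatrix} \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n^{0,1}) \\ \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n^{0}) \\ \tilde{\mathcal{O}}^{1,1}(\mathcal{U}_n) \end{pmatrix} \xrightarrow{d} \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} \tilde J_1 \\ R \\ S \end{pmatrix}, \tag{85}$$

as $n \to \infty$, by Proposition 4.4. By (80), the final term in (85) is equal in distribution to the same lower-triangular matrix applied to the right-hand side of (80). Multiplying out and using the fact that $(2U - 1)\mathbf{1}_{\{U > 1/2\}} = U - \min\{U, 1-U\}$ we have

$$\begin{pmatrix} \tilde J_1 \\ \tilde J_1 + R \\ \tilde J_1 + R + S \end{pmatrix} \stackrel{d}{=} U \begin{pmatrix} \tilde J_1^{\{1\}} \\ \tilde J_1^{\{1\}} \\ \tilde J_1^{\{1\}} + R^{\{1\}} \end{pmatrix} + (1-U) \begin{pmatrix} \tilde J_1^{\{2\}} \\ \tilde J_1^{\{2\}} + R^{\{2\}} \\ \tilde J_1^{\{2\}} + R^{\{2\}} \end{pmatrix} + \begin{pmatrix} (U/2)\log U + \frac{1-U}{2}\log(1-U) + \min\{U, 1-U\} \\ (U/2)\log U + \frac{1-U}{2}\log(1-U) + \frac{U}{2} \\ (U/2)\log U + \frac{1-U}{2}\log(1-U) + \frac{1}{4} \end{pmatrix}.$$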


So setting $\tilde H_1 = \tilde J_1 + R$ and $\tilde G_1 = \tilde J_1 + R + S$, we have (27). Proposition 4.4 gives
$E[\tilde J_1] = E[R] = E[S] = 0$, and so $E[\tilde H_1] = E[\tilde G_1] = 0$ also. Proposition 4.4 also gives
Var[Ji]. We obtain the higher moments of Hi and Gi from and ()16() . The stated 
covariances follow from the fixed point equation (|80j) and the moments given in Proposition 

1131 

Finally, part (iii) of the theorem is Proposition 14.51 □ 

Proof of Theorem 12.21 Parts (i) and (ii) of the theorem follow directly from the 
corresponding parts of Theorem 14. 11 It remains to prove part (iii) of the theorem. Suppose 
a > 1. Consider the a > 1 case of ()29() . We use the fact that N{n) and n — N{n) tend 
to infinity almost surely, the independence given U and N{n), and the convergence in 
and almost surely of (!)^'"(W^) (for q > 1) to obtain the result. □ 



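5 Proof of Theorem 2.3

Proof of Theorem 2.3. We make use of the theory of Dirichlet spacings as discussed in Section 4.2. Since the nearest-neighbour (directed) graph joins each vertex (which sits at the endpoint of each spacing apart from the points 0 and 1) to its nearest neighbour, we have, for $n \ge 3$,

$$\mathcal{L}^{1,\alpha}(\mathcal{U}_n) = (S_2^n)^\alpha + (S_n^n)^\alpha + \sum_{i=2}^{n-1} \left( \min\{S_i^n, S_{i+1}^n\} \right)^\alpha. \tag{86}$$

Now, from (86), using exchangeability we have that

$$E[\mathcal{L}^{1,\alpha}(\mathcal{U}_n)] = 2E[(S_1^n)^\alpha] + (n-2) E[(\min\{S_1^n, S_2^n\})^\alpha],$$

where, from (39) and (37) we have

$$E[(\min\{S_1^n, S_2^n\})^\alpha] = 2^{-\alpha} E[(S_1^n)^\alpha] = 2^{-\alpha} \frac{\Gamma(n+1)\Gamma(\alpha+1)}{\Gamma(n+\alpha+1)}. \tag{87}$$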
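Then the stated expression for the expectation follows. We now prove (23). Squaring both sides of (86) and taking expectations, we have

$$E\left[ (\mathcal{L}^{1,\alpha}(\mathcal{U}_n))^2 \right] = \sum_{i=2}^{n-1} E\left[ (\min\{S_i^n, S_{i+1}^n\})^{2\alpha} \right] + 2 \sum_{i=3}^{n-1} \sum_{j=2}^{i-1} E\left[ (\min\{S_i^n, S_{i+1}^n\})^\alpha (\min\{S_j^n, S_{j+1}^n\})^\alpha \right]$$

$$\qquad + E[(S_2^n)^{2\alpha}] + E[(S_n^n)^{2\alpha}] + 2 \sum_{j=2}^{n-1} E\left[ \left( (S_2^n)^\alpha + (S_n^n)^\alpha \right) (\min\{S_j^n, S_{j+1}^n\})^\alpha \right] + 2 E[(S_2^n)^\alpha (S_n^n)^\alpha].$$

Then, by exchangeability,

$$E\left[ (\mathcal{L}^{1,\alpha}(\mathcal{U}_n))^2 \right] = (n-2) E\left[ (\min\{S_1^n, S_2^n\})^{2\alpha} \right] + 2 E[(S_1^n S_2^n)^\alpha] + (n-3)(n-4) E\left[ (\min\{S_1^n, S_2^n\})^\alpha (\min\{S_3^n, S_4^n\})^\alpha \right]$$

$$\qquad + 2(n-3) E\left[ (\min\{S_1^n, S_2^n\})^\alpha (\min\{S_2^n, S_3^n\})^\alpha \right] + 2 E[(S_1^n)^{2\alpha}] + 4(n-3) E\left[ (S_1^n)^\alpha (\min\{S_2^n, S_3^n\})^\alpha \right] + 4 E\left[ (S_1^n)^\alpha (\min\{S_1^n, S_2^n\})^\alpha \right]. \tag{88}$$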

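Now, by (38) and (40) we have

$$E[(S_1^n)^\alpha (\min\{S_2^n, S_3^n\})^\alpha] = 2^{-\alpha} E[(S_1^n)^\alpha (S_2^n)^\alpha] = 2^{-\alpha} \frac{\Gamma(n+1)\Gamma(1+\alpha)^2}{\Gamma(n+1+2\alpha)},$$

and, using (38) this time with (41) we obtain

$$E[(\min\{S_1^n, S_2^n\})^\alpha (\min\{S_3^n, S_4^n\})^\alpha] = 2^{-2\alpha} \frac{\Gamma(n+1)\Gamma(1+\alpha)^2}{\Gamma(n+1+2\alpha)}.$$

Also we have that

$$E[(S_1^n)^\alpha (\min\{S_1^n, S_2^n\})^\alpha] = E[(S_1^n)^{2\alpha} \mathbf{1}_{\{S_1^n < S_2^n\}}] + E[(S_1^n)^\alpha (S_2^n)^\alpha \mathbf{1}_{\{S_1^n > S_2^n\}}] = \frac{1}{2} E[(\min\{S_1^n, S_2^n\})^{2\alpha}] + \frac{1}{2} E[(S_1^n)^\alpha (S_2^n)^\alpha].$$

Hence from (87) and (38) we obtain

$$E[(S_1^n)^\alpha (\min\{S_1^n, S_2^n\})^\alpha] = \frac{1}{2} \left( 2^{-2\alpha} \Gamma(1+2\alpha) + \Gamma(1+\alpha)^2 \right) \frac{\Gamma(n+1)}{\Gamma(n+1+2\alpha)}.$$

The final term on the right hand side of (88) that we need to evaluate is

$$E[(\min\{S_1^n, S_2^n\})^\alpha (\min\{S_2^n, S_3^n\})^\alpha] = E[(S_2^n)^{2\alpha} \mathbf{1}_{\{S_2^n < S_1^n,\, S_2^n < S_3^n\}}] + 4 E[(S_1^n)^\alpha (S_2^n)^\alpha \mathbf{1}_{\{S_1^n < S_2^n < S_3^n\}}]. \tag{89}$$

For the first term on the right of (89), by (42) we have

$$E[(S_2^n)^{2\alpha} \mathbf{1}_{\{S_2^n < S_1^n,\, S_2^n < S_3^n\}}] = \frac{1}{3} E[(\min\{S_1^n, S_2^n, S_3^n\})^{2\alpha}] = 3^{-1-2\alpha} \frac{\Gamma(1+2\alpha)\Gamma(n+1)}{\Gamma(n+1+2\alpha)}.$$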
Now consider the second term on the right of (89). By a direct computation using (35), we have

$$E[(S_1^n)^\alpha (S_2^n)^\alpha \mathbf{1}_{\{S_1^n < S_2^n < S_3^n\}}] = n(n-1)(n-2) \int_0^{1/3} dy \int_y^{(1-y)/2} dx \int_x^{1-x-y} dz\; x^\alpha y^\alpha (1 - x - y - z)^{n-3}$$

$$\qquad = n(n-1) \int_0^{1/3} dy \int_y^{(1-y)/2} x^\alpha y^\alpha (1 - y - 2x)^{n-2}\, dx,$$

which, via the change of variables $w = y + 2x$ and Fubini's theorem is the same as

$$n(n-1) 2^{-\alpha-1} \int_0^1 dw\, (1-w)^{n-2} \int_0^{w/3} y^\alpha (w - y)^\alpha\, dy.$$

Setting $t = 3y/w$ reduces this to

$$n(n-1) 6^{-\alpha-1} \int_0^1 w^{1+2\alpha} (1-w)^{n-2}\, dw \int_0^1 t^\alpha (1 - (t/3))^\alpha\, dt.$$

Using (36) for the integral involving $w$, and the fact that (see, e.g., 15.3.1 in Abramowitz and Stegun's handbook) for $a > 0$,

$$\int_0^1 t^{a-1} (1 - tz)^{-b}\, dt = \frac{1}{a}\, {}_2F_1(b, a; a+1; z)$$

for the integral involving $t$, we obtain the expression for $J_{n,\alpha}$ as given in the statement of the theorem. Then, by (88) and the subsequent calculations, we obtain (23). Finally, (24) follows from (23) by (48). □

Appendix: technical lemmas 

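Proof of Lemma 4.4. The result is trivial when $\alpha = 1$ or $\alpha = 0$. Suppose $0 < \alpha < 1$. Suppose $n > 1$. To ease notation, for the duration of this proof, set $m = n - 1$. Then we have that for any $U \in (0,1)$ and $0 \le N(n) \le m$,

$$0 \le U^\alpha \left( \frac{N(n)}{m} \right)^{1-\alpha} + (1-U)^\alpha \left( \frac{m - N(n)}{m} \right)^{1-\alpha} \le 1,$$

so that in particular $|B_\alpha(n)| \le n^{1/2}$ almost surely for $0 < \alpha < 1$. Let

$$W_n := \frac{N(n) - mU}{\sqrt{mU(1-U)}},$$

so that $E[W_n] = 0$, $E[W_n^2] = 1$, and

$$\frac{N(n)}{mU} = 1 + W_n \sqrt{\frac{1-U}{mU}}, \qquad \frac{m - N(n)}{m(1-U)} = 1 - W_n \sqrt{\frac{U}{m(1-U)}}.$$

Then, by Taylor's theorem,

$$U^\alpha \left( \frac{N(n)}{m} \right)^{1-\alpha} = U \left( 1 + (1-\alpha) W_n \sqrt{\frac{1-U}{mU}} - R_1(n) W_n^2 \frac{1-U}{mU} \right) \tag{91}$$

$$\qquad = U \left( 1 + R_2(n) W_n \sqrt{\frac{1-U}{mU}} \right), \tag{92}$$

for remainder terms $R_1(n)$, $R_2(n)$ (which depend on $W_n$ and $U$). Similarly, we have

$$(1-U)^\alpha \left( \frac{m - N(n)}{m} \right)^{1-\alpha} = (1-U) \left( 1 - (1-\alpha) W_n \sqrt{\frac{U}{m(1-U)}} - R_3(n) W_n^2 \frac{U}{m(1-U)} \right) \tag{93}$$

$$\qquad = (1-U) \left( 1 - R_4(n) W_n \sqrt{\frac{U}{m(1-U)}} \right). \tag{94}$$

By the Lagrange form of the remainder in Taylor's theorem and a continuity argument at $x = 0$ there exists a constant $B \in (0, \infty)$ such that for $\beta = 1 - \alpha$,

$$0 \ge \frac{(1+x)^\beta - 1 - \beta x}{x^2} \ge -B, \quad \text{and} \quad 0 \le \frac{(1+x)^\beta - 1}{x} \le B,$$

for all $x \ge -1$. Thus we have, for $i \in \{1, 2, 3, 4\}$,

$$0 \le R_i(n) \le C, \tag{95}$$

for a finite positive constant $C$.

For $n > 1$, $m = n - 1$, let $E_n$ denote the event $m^{-1/2} < U < 1 - m^{-1/2}$. From (91) and (93) we obtain

$$|B_\alpha(n) \mathbf{1}_{E_n}| = \left| -R_1(n) W_n^2 (1-U) m^{-1/2} - R_3(n) W_n^2 U m^{-1/2} \right| \mathbf{1}_{E_n} \le C m^{-1/2} W_n^2 \mathbf{1}_{E_n},$$

for some $C \in (0, \infty)$. By a standard moment generating function calculation,

$$E[(N(n) - mU)^6 \mid U] = mU(1-U) \left[ 15 m^2 U^2 (1-U)^2 - 130 m U^2 (1-U)^2 + 25 m U(1-U) - 30 U(1-U)(1-2U)^2 + 1 \right]$$

$$\qquad \le mU(1-U) \left( 15 m^2 U^2 (1-U)^2 + 25 m U(1-U) + 1 \right). \tag{96}$$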
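The sixth central moment formula in (96) can be verified exactly for small parameters by direct summation over the binomial distribution. The sketch below uses exact rational arithmetic; the function names are illustrative assumptions, and `p` plays the role of the conditioned value of $U$.

```python
from fractions import Fraction
from math import comb


def sixth_central_moment(m, p):
    """E[(N - m p)^6] for N ~ Bin(m, p), by direct summation."""
    q = 1 - p
    return sum(
        comb(m, k) * p ** k * q ** (m - k) * (k - m * p) ** 6
        for k in range(m + 1)
    )


def formula(m, p):
    """Closed form displayed in the first line of (96)."""
    v = p * (1 - p)
    return m * v * (
        15 * m ** 2 * v ** 2
        - 130 * m * v ** 2
        + 25 * m * v
        - 30 * v * (1 - 2 * p) ** 2
        + 1
    )


def upper_bound(m, p):
    """Upper bound displayed in the second line of (96)."""
    v = p * (1 - p)
    return m * v * (15 * m ** 2 * v ** 2 + 25 * m * v + 1)


for m in (1, 2, 7, 20):
    for p in (Fraction(1, 3), Fraction(1, 2), Fraction(9, 10)):
        exact = sixth_central_moment(m, p)
        print(m, p, exact == formula(m, p), exact <= upper_bound(m, p))
```

Because `Fraction` arithmetic is exact, the equality test is a genuine identity check rather than a floating-point comparison.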

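By (96) we have that

$$E[W_n^6 \mathbf{1}_{E_n}] \le E[(N(n) - mU)^6 m^{-3} U^{-3} (1-U)^{-3} \mid E_n] = O(1),$$

as $n \to \infty$, so from the bound on $|B_\alpha(n) \mathbf{1}_{E_n}|$ above we have that

$$B_\alpha(n) \mathbf{1}_{E_n} \xrightarrow{L^3} 0. \tag{97}$$

Also, from (92) and (94) we have

$$|B_\alpha(n) \mathbf{1}_{E_n^c}| = \left| (R_2(n) - R_4(n)) W_n U^{1/2} (1-U)^{1/2} \right| \mathbf{1}_{E_n^c},$$

and so using (95) we have

$$|B_\alpha(n)|^3 \mathbf{1}_{E_n^c} \le C \left| W_n U^{1/2} (1-U)^{1/2} \right|^3 \mathbf{1}_{E_n^c}. \tag{98}$$

Now, from (96) we have that

$$E[(W_n U^{1/2} (1-U)^{1/2})^6] = m^{-3} E[(N(n) - mU)^6] = O(1),$$

as $n \to \infty$, so by Cauchy-Schwarz and the fact that $P[E_n^c] = O(n^{-1/2})$ we obtain from (98) that as $n \to \infty$

$$E[|B_\alpha(n)|^3 \mathbf{1}_{E_n^c}] \to 0. \tag{99}$$

So (97) and (99) complete the proof. □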

Proof of Lemma 4.5. For $n \in \mathbb{N}$, let $M_n := \log^+ N(n) - \log n - \log U$. First, suppose $N(n) > nU/2$. We have that

$$-\log 2 < M_n \mathbf{1}_{\{N(n) > nU/2\}} \mathbf{1}_{\{nU > 2\}} \le -\log U.$$

Hence

$$U^2 M_n^2 \mathbf{1}_{\{N(n) > nU/2\}} \mathbf{1}_{\{nU > 2\}} \le \max\{(\log 2)^2, (\log U)^2\}. \tag{100}$$

The expected value of the right hand side of (100) is finite. Also, $U^2 M_n^2 \to 0$ almost surely as $n \to \infty$, by continuity and the strong law of large numbers for $N(n)$. Hence, by the dominated convergence theorem,

$$E[U^2 M_n^2 \mathbf{1}_{\{N(n) > nU/2\}} \mathbf{1}_{\{nU > 2\}}] \to 0. \tag{101}$$

Also, we have $0 \le \log^+ N(n) \le \log n$, so that $-\log n \le M_n \le -\log U$. Hence

$$U^2 M_n^2 \le (\log n)^2 + (\log U)^2, \tag{102}$$

so that $E[U^4 M_n^4] = O((\log n)^4)$. Since $P[nU < 2] = 2n^{-1}$, we then obtain, by Cauchy-Schwarz, that there exists a finite positive constant $C$ such that

$$E[U^2 M_n^2 \mathbf{1}_{\{N(n) > nU/2\}} \mathbf{1}_{\{nU \le 2\}}] \le C (\log n)^2 n^{-1/2} \to 0, \tag{103}$$

as $n \to \infty$. Now, suppose $N(n) \le nU/2$. In this case, from (102), and Cauchy-Schwarz again, for some finite positive constant $C$,

$$E[U^2 M_n^2 \mathbf{1}_{\{N(n) \le nU/2\}}] \le C (\log n)^2 (P[N(n) \le nU/2])^{1/2} \to 0, \tag{104}$$

as $n \to \infty$, since

$$(\log n)^2 (P[N(n) \le nU/2])^{1/2} \le (\log n)^2 \left( P[U < n^{-1/2}] + P[U > n^{-1/2},\, N(n) \le nU/2] \right)^{1/2},$$

which tends to zero as $n \to \infty$, using standard bounds for the tail of a binomial distribution (see, e.g., Lemma 1.1 in JHl) for the final probability. The results (101), (103), and (104) then give (77). The argument for (78) is similar. □